[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..

IMPALA-8891: Fix non-standard null handling in concat_ws()

This patch fixes the non-standard null handling logic for
function 'concat_ws', while maintaining the original null
handling for function 'concat'

Existing statuses:
For function concat_ws, any null string element in array
argument 'strs' will result in null result, just like below:

select concat_ws('-','foo',null,'bar') as expr1;
+---+
| expr1 |
+---+
| NULL  |
+---+

New Statuses:
In this implementation, the function conforms to hive standard:
1.will join all the non-null string objects as the result
2.if all string objects are null, return empty string
3.if separator is null, return null
below is a example:
-
select concat_ws('-','foo',null,'bar') as expr1;
+--+
|  expr1   |
+--+
| foo-bar  |
+--+


Key changes:
* Reimplement function StringFunctions::ConcatWs by filtering the
  null value and only process the valid string values, based on
  original code structure.
* StringFunctions::Concat was also reimplemented, as it used to
  call ConcatWs but should keep the original NULL handling.

Testing:
* Ran exaustive tests.

Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Reviewed-on: http://gerrit.cloudera.org:8080/14885
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
2 files changed, 77 insertions(+), 21 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 13
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..


Patch Set 12: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 12
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 
Gerrit-Comment-Date: Tue, 17 Dec 2019 07:11:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14764 )

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..

IMPALA-4192: Move static state from ExecNode into a PlanNode

This patch adds a new class called PlanNode which contains a subset
of the static state  of their corresponding ExecNode, of which there
is one instance per fragment. ExecNode contains the runtime state
and there can be up to MT_DOP instances of it per fragment.

It also adds a similar class called AggregatorConfig which contains
static state initialized from the thrift aggregator struct and is
passed as an input to the Aggregator class's constructor.

Eventually all static state including codegened function pointers
would be moved to the PlanNodes.

Testing:
Ran exhaustive tests successfully.

Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Reviewed-on: http://gerrit.cloudera.org:8080/14764
Reviewed-by: Bikramjeet Vig 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/aggregation-node-base.cc
M be/src/exec/aggregation-node-base.h
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregation-node.h
M be/src/exec/aggregator.cc
M be/src/exec/aggregator.h
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/cardinality-check-node.cc
M be/src/exec/cardinality-check-node.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/empty-set-node.cc
M be/src/exec/empty-set-node.h
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node-mt.cc
M be/src/exec/kudu-scan-node-mt.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/non-grouping-aggregator.h
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/exec/select-node.cc
M be/src/exec/select-node.h
M be/src/exec/singular-row-src-node.cc
M be/src/exec/singular-row-src-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/streaming-aggregation-node.cc
M be/src/exec/streaming-aggregation-node.h
M be/src/exec/subplan-node.cc
M be/src/exec/subplan-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
M be/src/exec/unnest-node.cc
M be/src/exec/unnest-node.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
64 files changed, 1,338 insertions(+), 700 deletions(-)

Approvals:
  Bikramjeet Vig: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 14
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14764 )

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..


Patch Set 13: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 13
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 05:41:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] Added support of Kudu DATE type to Impala

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14705 )

Change subject: Added support of Kudu DATE type to Impala
..


Patch Set 4:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/5299/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/14705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
Gerrit-Change-Number: 14705
Gerrit-PatchSet: 4
Gerrit-Owner: Volodymyr Verovkin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Volodymyr Verovkin 
Gerrit-Comment-Date: Tue, 17 Dec 2019 05:20:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] Added support of Kudu DATE type to Impala

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14705 )

Change subject: Added support of Kudu DATE type to Impala
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14705/4/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/14705/4/tests/query_test/test_kudu.py@437
PS4, Line 437: ,
flake8: E501 line too long (99 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/14705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
Gerrit-Change-Number: 14705
Gerrit-PatchSet: 4
Gerrit-Owner: Volodymyr Verovkin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Volodymyr Verovkin 
Gerrit-Comment-Date: Tue, 17 Dec 2019 04:59:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Added support of Kudu DATE type to Impala

2019-12-16 Thread Volodymyr Verovkin (Code Review)
Volodymyr Verovkin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14705 )

Change subject: Added support of Kudu DATE type to Impala
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14705/1/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_create.test:

http://gerrit.cloudera.org:8080/#/c/14705/1/testdata/workloads/functional-query/queries/QueryTest/kudu_create.test@155
PS1, Line 155: date
> Do you plan to add that new test in this patchset or in a separate one?
Done



--
To view, visit http://gerrit.cloudera.org:8080/14705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
Gerrit-Change-Number: 14705
Gerrit-PatchSet: 4
Gerrit-Owner: Volodymyr Verovkin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Volodymyr Verovkin 
Gerrit-Comment-Date: Tue, 17 Dec 2019 04:59:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Added support of Kudu DATE type to Impala

2019-12-16 Thread Volodymyr Verovkin (Code Review)
Hello Thomas Tauber-Marshall, Alexey Serbin, Adar Dembo, Grant Henke, Tim 
Armstrong, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14705

to look at the new patch set (#4).

Change subject: Added support of Kudu DATE type to Impala
..

Added support of Kudu DATE type to Impala

Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
---
M be/src/exec/kudu-util.cc
M fe/src/main/java/org/apache/impala/util/KuduUtil.java
M fe/src/test/java/org/apache/impala/analysis/AnalyzeKuduDDLTest.java
M testdata/workloads/functional-query/queries/QueryTest/kudu_create.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_delete.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_describe.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_update.test
M testdata/workloads/functional-query/queries/QueryTest/kudu_upsert.test
M tests/query_test/test_kudu.py
10 files changed, 649 insertions(+), 570 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/14705/4
--
To view, visit http://gerrit.cloudera.org:8080/14705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
Gerrit-Change-Number: 14705
Gerrit-PatchSet: 4
Gerrit-Owner: Volodymyr Verovkin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Volodymyr Verovkin 


[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14915 )

Change subject: Fix handling of 'Connection' header in squeasel
..

Fix handling of 'Connection' header in squeasel

This patch bumps Impala's version of squeasel to
030ccce87359d892e22fb368c5fc5b75d9a2a5f7
in order to pull in a fix for an issue where squeasel was improperly
closing connections when the 'Connection: Upgrade' header was included
in the request, which Apache Knox will include in certain situations.

See https://github.com/cloudera/squeasel/pull/16

Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Reviewed-on: http://gerrit.cloudera.org:8080/14915
Reviewed-by: Adar Dembo 
Reviewed-by: Tim Armstrong 
Tested-by: Impala Public Jenkins 
---
M be/src/thirdparty/squeasel/squeasel.c
1 file changed, 21 insertions(+), 10 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14915 )

Change subject: Fix handling of 'Connection' header in squeasel
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 04:34:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5298/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 17 Dec 2019 04:06:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9257: Last event id should be advanced if all events are skipped

2019-12-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14916 )

Change subject: IMPALA-9257: Last event id should be advanced if all events are 
skipped
..


Patch Set 1:

(1 comment)

Nice find!

http://gerrit.cloudera.org:8080/#/c/14916/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
File 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java:

http://gerrit.cloudera.org:8080/#/c/14916/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java@593
PS1, Line 593: if (events.isEmpty()) return;
Should we still update EVENTS_RECEIVED_METRIC in this case?



--
To view, visit http://gerrit.cloudera.org:8080/14916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32
Gerrit-Change-Number: 14916
Gerrit-PatchSet: 1
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Dec 2019 03:45:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2019-12-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 7:

Fix issues for CreateView and AlterView. Also add test for Union statements.


--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 17 Dec 2019 03:36:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2019-12-16 Thread Quanlong Huang (Code Review)
Hello Fang-Yu Rao, Vihang Karajgaonkar, Kurt Deschler, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14894

to look at the new patch set (#7).

Change subject: IMPALA-9009: Core support for Ranger column masking
..

IMPALA-9009: Core support for Ranger column masking

Ranger provides column masking policies about how to show masked values
to specific users when reading specific columns. This patch add support
to rewrite the query AST based on column masking policies.

We perform the column masking policies by replacing the TableRef with a
subquery doing the masking. For instance, the following query
  select c_id, c_name from customer c join orders on c_id = o_cid
will be transfomed into
  select c_id, c_name  from (
select mask1(c_id) as c_id, mask2(c_name) as c_name from customer
  ) c
  join orders
  on c_id = o_cid

The transfomation is done in AST resolution. Just like view resolution,
if the table needs masking we replace it with a subquery(InlineViewRef)
containing the masking expressions.

This patch only adds support for mask types that don't require builtin
mask functions. So currently supported masking types are MASK_NULL and
CUSTOM.

Current Limitations:
 - Users are required to have privileges on all columns of a masked
   table(IMPALA-9223), since the table mask subquery contains all the
   columns.

Tests:
 - Add e2e tests for masked results
 - Run core tests

Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
---
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/AuthorizationFactory.java
M fe/src/main/java/org/apache/impala/authorization/BaseAuthorizationChecker.java
M fe/src/main/java/org/apache/impala/authorization/NoopAuthorizationFactory.java
A fe/src/main/java/org/apache/impala/authorization/TableMask.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationFactory.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationChecker.java
M 
fe/src/main/java/org/apache/impala/authorization/sentry/SentryAuthorizationFactory.java
M fe/src/test/java/org/apache/impala/authorization/AuthorizationStmtTest.java
M fe/src/test/java/org/apache/impala/common/FrontendTestBase.java
A 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking.test
M tests/authorization/test_ranger.py
M tests/common/impala_test_suite.py
23 files changed, 756 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/14894/7
--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 7
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 12
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 
Gerrit-Comment-Date: Tue, 17 Dec 2019 02:38:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5352/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 12
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 
Gerrit-Comment-Date: Tue, 17 Dec 2019 02:38:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..


Patch Set 11: Code-Review+2

(1 comment)

LGTM. Thank Ji!

http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc@918
PS11, Line 918:   DCHECK_GT(valid_num_children, 0);
> This should be moved to before the if-clause since we used to check it in t
Sorry, that I miss-read this as DCHECK_GE. Then it's right here since 
"valid_num_children == 0" is covered by "valid_start_index < 0"



--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 11
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 
Gerrit-Comment-Date: Tue, 17 Dec 2019 02:37:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/14885/11/be/src/exprs/string-functions-ir.cc@918
PS11, Line 918:   DCHECK_GT(valid_num_children, 0);
This should be moved to before the if-clause since we used to check it in the 
if-clause. You can put it right after the for-loop.



--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 11
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 
Gerrit-Comment-Date: Tue, 17 Dec 2019 02:27:30 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-9236 Make llvm for aarch64 platform.

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14900 )

Change subject: IMPALA-9236 Make llvm for aarch64 platform.
..

IMPALA-9236 Make llvm for aarch64 platform.

Change-Id: I4a59e8c9c847b3c7eb05584230e1ecbfebcf0104
Reviewed-on: http://gerrit.cloudera.org:8080/14900
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M source/llvm/build-3.3.sh
M source/llvm/build-source-tarball.sh
2 files changed, 4 insertions(+), 0 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14900
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4a59e8c9c847b3c7eb05584230e1ecbfebcf0104
Gerrit-Change-Number: 14900
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] IMPALA-9236 Make llvm for aarch64 platform.

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14900 )

Change subject: IMPALA-9236 Make llvm for aarch64 platform.
..


Patch Set 3: Verified+1 Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14900
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4a59e8c9c847b3c7eb05584230e1ecbfebcf0104
Gerrit-Change-Number: 14900
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Matthew Jacobs 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 01:48:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5297/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 11
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 
Gerrit-Comment-Date: Tue, 17 Dec 2019 01:37:10 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-9246 Add patch for crcutils to support aarch64

2019-12-16 Thread Anonymous Coward (Code Review)
Hello Tim Armstrong,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14901

to look at the new patch set (#4).

Change subject: IMPALA-9246 Add patch for crcutils to support aarch64
..

IMPALA-9246 Add patch for crcutils to support aarch64

This adds a patch for crcutils to make it works on
aarch64 platform.

Change-Id: Ia3c38563a549ab9c522e23a19e9951f5ab0ada37
---
M buildall.sh
A 
source/crcutil/crcutil-440ba7babeff77ffad992df3a10c767f184e946e-patches/0002-Build-crcutil-440ba7b-on-aarch64.patch
2 files changed, 119 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/01/14901/4
--
To view, visit http://gerrit.cloudera.org:8080/14901
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia3c38563a549ab9c522e23a19e9951f5ab0ada37
Gerrit-Change-Number: 14901
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9257: Last event id should be advanced if all events are skipped

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14916 )

Change subject: IMPALA-9257: Last event id should be advanced if all events are 
skipped
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5296/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32
Gerrit-Change-Number: 14916
Gerrit-PatchSet: 1
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Tue, 17 Dec 2019 01:28:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14538 )

Change subject: IMPALA-4400: aggregate runtime filters locally
..


Patch Set 16:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5295/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iabeeab5eec869ff2197250ad41c1eb5551704acc
Gerrit-Change-Number: 14538
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 01:24:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8891: Fix non-standard null handling in concat ws()

2019-12-16 Thread jichen (Code Review)
jichen has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/14885 )

Change subject: IMPALA-8891: Fix non-standard null handling in concat_ws()
..

IMPALA-8891: Fix non-standard null handling in concat_ws()

This patch fixes the non-standard null handling logic for
function 'concat_ws', while maintaining the original null
handling for function 'concat'

Existing statuses:
For function concat_ws, any null string element in array
argument 'strs' will result in null result, just like below:

select concat_ws('-','foo',null,'bar') as expr1;
+---+
| expr1 |
+---+
| NULL  |
+---+

New Statuses:
In this implementation, the function conforms to hive standard:
1.will join all the non-null string objects as the result
2.if all string objects are null, return empty string
3.if separator is null, return null
below is a example:
-
select concat_ws('-','foo',null,'bar') as expr1;
+--+
|  expr1   |
+--+
| foo-bar  |
+--+


Key changes:
* Reimplement function StringFunctions::ConcatWs by filtering the
  null value and only process the valid string values, based on
  original code structure.
* StringFunctions::Concat was also reimplemented, as it used to
  call ConcatWs but should keep the original NULL handling.

Testing:
* Ran exaustive tests.

Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
---
M be/src/exprs/expr-test.cc
M be/src/exprs/string-functions-ir.cc
2 files changed, 77 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/85/14885/11
--
To view, visit http://gerrit.cloudera.org:8080/14885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I64cd3bfbb952e431a0cf52a5835ac05d2513d29b
Gerrit-Change-Number: 14885
Gerrit-PatchSet: 11
Gerrit-Owner: jichen 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: jichen 


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14764 )

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..


Patch Set 13: Code-Review+2

carrying over +2


--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 13
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 01:15:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14764 )

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5351/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 13
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 01:15:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9257: Last event id should be advanced if all events are skipped

2019-12-16 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14916


Change subject: IMPALA-9257: Last event id should be advanced if all events are 
skipped
..

IMPALA-9257: Last event id should be advanced if all events are skipped

Events processor implements a filtering method which skips certain
unnecessary events (eg. events on blacklisted dbs and tables).
However, if the received batch has all the events which are filtered
out, it fails to update its lastSyncedEventId. This causes unnecessary
logs being printed in catalog and the same event batch being fetched
repeatedly.

Testing:
Modified existing test to compare event id after events on blacklisted
dbs and tables.

Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32
---
M 
fe/src/main/java/org/apache/impala/catalog/events/MetastoreEventsProcessor.java
M tests/custom_cluster/test_event_processing.py
2 files changed, 7 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/16/14916/1
--
To view, visit http://gerrit.cloudera.org:8080/14916
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7f94c1a8e8c221f504262d5591cda8c3a25c0c32
Gerrit-Change-Number: 14916
Gerrit-PatchSet: 1
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#16). ( 
http://gerrit.cloudera.org:8080/14538 )

Change subject: IMPALA-4400: aggregate runtime filters locally
..

IMPALA-4400: aggregate runtime filters locally

Move RuntimeFilterBank to QueryState(). Implement fine-grained
locking for each filter to mitigate any increased lock
contention from the change.

Make RuntimeFilterBank handle multiple producers of the
same filter, e.g. multiple instances of a partitioned
join. It computes the expected number of filters upfront
then sends the filter to the coordinator once all the
local instances have been merged together. The merging
can done in parallel locally to improve latency of
filter propagation.

Add Or() methods to MinMaxFilter and BloomFilter, since
we now need to merge those, not just the thrift versions.

Update coordinator filter routing to expect only one
instance of a filter from each producer backend and
to only send one instance to each consumer backend
(instead of sending one per fragment).

Update memory reservations and estimates to be lower
to account for sharing of filters between fragment
instances. mt_dop plans are modified to show these
shared and non-shared resources separately.

Enable waiting for runtime filters for kudu scanner
with mt_dop.

Made min/max filters const-correct.

Testing
* Added unit tests for Or() methods.
* Added some additional e2e test coverage for mt_dop queries
* Updated planner tests with new estimates and reservation.
* Ran a single node 3-impalad stress test with TPC-H kudu and
  TPC-DS parquet.
* Ran exhaustive tests.
* Ran core tests with ASAN.

Perf
* Did a single-node perf run on TPC-H with default settings. No perf change.
* Single-node perf run with mt_dop=8 showed significant speedups:

+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(30) | parquet / none / none | 10.14   | -7.29% | 5.05   | 
-11.68%|
+--+---+-++++

+--+--+---++-++---++---++-+-+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval|
+--+--+---++-++---++---++-+-+
| TPCH(30) | TPCH-Q7  | parquet / none / none | 38.87  | 38.44   |   +1.13% 
  |   7.17%   | * 10.92% * | 20|   +0.72%   | 0.72| 0.39|
| TPCH(30) | TPCH-Q1  | parquet / none / none | 4.28   | 4.26|   +0.50% 
  |   1.92%   |   1.09%| 20|   +0.03%   | 0.31| 1.01|
| TPCH(30) | TPCH-Q22 | parquet / none / none | 2.32   | 2.32|   +0.05% 
  |   2.01%   |   1.89%| 20|   -0.03%   | -0.36   | 0.08|
| TPCH(30) | TPCH-Q15 | parquet / none / none | 3.73   | 3.75|   -0.42% 
  |   0.84%   |   1.05%| 20|   -0.25%   | -0.77   | -1.40   |
| TPCH(30) | TPCH-Q13 | parquet / none / none | 9.80   | 9.83|   -0.38% 
  |   0.51%   |   0.80%| 20|   -0.32%   | -1.30   | -1.81   |
| TPCH(30) | TPCH-Q2  | parquet / none / none | 1.98   | 2.00|   -1.32% 
  |   1.74%   |   2.81%| 20|   -0.64%   | -1.71   | -1.79   |
| TPCH(30) | TPCH-Q6  | parquet / none / none | 1.22   | 1.25|   -2.14% 
  |   2.66%   |   4.15%| 20|   -0.96%   | -2.00   | -1.95   |
| TPCH(30) | TPCH-Q19 | parquet / none / none | 5.13   | 5.22|   -1.65% 
  |   1.20%   |   1.40%| 20|   -1.76%   | -3.34   | -4.02   |
| TPCH(30) | TPCH-Q16 | parquet / none / none | 2.46   | 2.56|   -4.13% 
  |   2.49%   |   1.99%| 20|   -4.31%   | -4.04   | -5.94   |
| TPCH(30) | TPCH-Q9  | parquet / none / none | 81.63  | 85.07   |   -4.05% 
  |   4.94%   |   3.06%| 20|   -5.46%   | -3.28   | -3.21   |
| TPCH(30) | TPCH-Q10 | parquet / none / none | 5.07   | 5.50| I -7.92% 
  |   0.96%   |   1.33%| 20| I -8.51%   | -5.27   | -22.14  |
| TPCH(30) | TPCH-Q21 | parquet / none / none | 24.00  | 26.24   | I -8.57% 
  |   0.46%   |   0.38%| 20| I -9.34%   | -5.27   | -67.47  |
| TPCH(30) | TPCH-Q18 | parquet / none / none | 8.66   | 9.50| I -8.86% 
  |   0.62%   |   0.44%| 20| I -9.75%   | -5.27   | -55.17  |
| TPCH(30) | TPCH-Q3  | parquet / none / none | 6.01   | 6.70| I 
-10.19%  |   1.01%   |   0.90%| 20| I -11.25%  | -5.27   | 
-35.76  |
| 

[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14915 )

Change subject: Fix handling of 'Connection' header in squeasel
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/5350/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 00:10:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9009: Core support for Ranger column masking

2019-12-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14894 )

Change subject: IMPALA-9009: Core support for Ranger column masking
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14894/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14894/6//COMMIT_MSG@32
PS6, Line 32:  - CreateView and AlterView statements on masked tables/views are 
not
:dealt with in this patch, since the rewritten AST may not be 
able to
:generate sql without syntax errors (e.g. missing alias).
Just realize that we don't need to rewrite statements for table masking for 
CreateView and AlterView. The table masking will happen when the view is used. 
Will update the patch for this.



--
To view, visit http://gerrit.cloudera.org:8080/14894
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4cad60e0e69ea573b7ecfc011b142c46ef52ed61
Gerrit-Change-Number: 14894
Gerrit-PatchSet: 6
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 17 Dec 2019 00:08:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14915 )

Change subject: Fix handling of 'Connection' header in squeasel
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Dec 2019 00:07:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14764 )

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5294/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 13
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 23:42:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14915 )

Change subject: Fix handling of 'Connection' header in squeasel
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5293/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 23:27:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Adar Dembo (Code Review)
Adar Dembo has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14915 )

Change subject: Fix handling of 'Connection' header in squeasel
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 23:26:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14764 )

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..


Patch Set 13:

Seems like an unrelated failure (Auth test), not sure why it failed since I had 
earlier ran exhaustive tests and core tests on ASAN build successfully. Rebased 
and testing again.


--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 13
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 23:12:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4192: Move static state from ExecNode into a PlanNode

2019-12-16 Thread Bikramjeet Vig (Code Review)
Hello Michael Ho, Daniel Becker, Tim Armstrong, Csaba Ringhofer, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14764

to look at the new patch set (#13).

Change subject: IMPALA-4192: Move static state from ExecNode into a PlanNode
..

IMPALA-4192: Move static state from ExecNode into a PlanNode

This patch adds a new class called PlanNode which contains a subset
of the static state  of their corresponding ExecNode, of which there
is one instance per fragment. ExecNode contains the runtime state
and there can be up to MT_DOP instances of it per fragment.

It also adds a similar class called AggregatorConfig which contains
static state initialized from the thrift aggregator struct and is
passed as an input to the Aggregator class's constructor.

Eventually all static state including codegened function pointers
would be moved to the PlanNodes.

Testing:
Ran exhaustive tests successfully.

Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
---
M be/src/exec/aggregation-node-base.cc
M be/src/exec/aggregation-node-base.h
M be/src/exec/aggregation-node.cc
M be/src/exec/aggregation-node.h
M be/src/exec/aggregator.cc
M be/src/exec/aggregator.h
M be/src/exec/analytic-eval-node.cc
M be/src/exec/analytic-eval-node.h
M be/src/exec/blocking-join-node.cc
M be/src/exec/blocking-join-node.h
M be/src/exec/cardinality-check-node.cc
M be/src/exec/cardinality-check-node.h
M be/src/exec/data-source-scan-node.cc
M be/src/exec/data-source-scan-node.h
M be/src/exec/empty-set-node.cc
M be/src/exec/empty-set-node.h
M be/src/exec/exchange-node.cc
M be/src/exec/exchange-node.h
M be/src/exec/exec-node.cc
M be/src/exec/exec-node.h
M be/src/exec/grouping-aggregator.cc
M be/src/exec/grouping-aggregator.h
M be/src/exec/hbase-scan-node.cc
M be/src/exec/hbase-scan-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scan-node-mt.cc
M be/src/exec/hdfs-scan-node-mt.h
M be/src/exec/hdfs-scan-node.cc
M be/src/exec/hdfs-scan-node.h
M be/src/exec/kudu-scan-node-base.cc
M be/src/exec/kudu-scan-node-base.h
M be/src/exec/kudu-scan-node-mt.cc
M be/src/exec/kudu-scan-node-mt.h
M be/src/exec/kudu-scan-node.cc
M be/src/exec/kudu-scan-node.h
M be/src/exec/nested-loop-join-node.cc
M be/src/exec/nested-loop-join-node.h
M be/src/exec/non-grouping-aggregator.cc
M be/src/exec/non-grouping-aggregator.h
M be/src/exec/partial-sort-node.cc
M be/src/exec/partial-sort-node.h
M be/src/exec/partitioned-hash-join-node.cc
M be/src/exec/partitioned-hash-join-node.h
M be/src/exec/scan-node.cc
M be/src/exec/scan-node.h
M be/src/exec/select-node.cc
M be/src/exec/select-node.h
M be/src/exec/singular-row-src-node.cc
M be/src/exec/singular-row-src-node.h
M be/src/exec/sort-node.cc
M be/src/exec/sort-node.h
M be/src/exec/streaming-aggregation-node.cc
M be/src/exec/streaming-aggregation-node.h
M be/src/exec/subplan-node.cc
M be/src/exec/subplan-node.h
M be/src/exec/topn-node.cc
M be/src/exec/topn-node.h
M be/src/exec/union-node.cc
M be/src/exec/union-node.h
M be/src/exec/unnest-node.cc
M be/src/exec/unnest-node.h
M be/src/runtime/fragment-instance-state.cc
M be/src/runtime/fragment-instance-state.h
64 files changed, 1,338 insertions(+), 700 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/64/14764/13
--
To view, visit http://gerrit.cloudera.org:8080/14764
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I69f1676bf67bac31fa5902511b3fcc269fd67472
Gerrit-Change-Number: 14764
Gerrit-PatchSet: 13
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] Fix handling of 'Connection' header in squeasel

2019-12-16 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14915


Change subject: Fix handling of 'Connection' header in squeasel
..

Fix handling of 'Connection' header in squeasel

This patch bumps Impala's version of squeasel to
030ccce87359d892e22fb368c5fc5b75d9a2a5f7
in order to pull in a fix for an issue where squeasel was improperly
closing connections when the 'Connection: Upgrade' header was included
in the request, which Apache Knox will include in certain situations.

See https://github.com/cloudera/squeasel/pull/16

Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
---
M be/src/thirdparty/squeasel/squeasel.c
1 file changed, 21 insertions(+), 10 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/14915/1
--
To view, visit http://gerrit.cloudera.org:8080/14915
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9b9c4f328f0be7800aab978e14b7f46c67293417
Gerrit-Change-Number: 14915
Gerrit-PatchSet: 1
Gerrit-Owner: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-9248: fix test krpc rpcz on older kernels

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14913 )

Change subject: IMPALA-9248: fix test_krpc_rpcz on older kernels
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5292/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14913
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I40aece9d37622710a4ef3add2b1088bb3ac91563
Gerrit-Change-Number: 14913
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Comment-Date: Mon, 16 Dec 2019 22:06:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9248: fix test krpc rpcz on older kernels

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/14913


Change subject: IMPALA-9248: fix test_krpc_rpcz on older kernels
..

IMPALA-9248: fix test_krpc_rpcz on older kernels

Switched the test to check a different counter that
is present in the 2.6.32 kernel (see
https://kernel.googlesource.com/pub/scm/linux/kernel/git/vapier/lemote/+/linux-2.6.32-stable/include/linux/tcp.h)

Testing:
Ran the affected test on a CentOS 6 system.

Change-Id: I40aece9d37622710a4ef3add2b1088bb3ac91563
---
M tests/webserver/test_web_pages.py
1 file changed, 5 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/13/14913/1
--
To view, visit http://gerrit.cloudera.org:8080/14913
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I40aece9d37622710a4ef3add2b1088bb3ac91563
Gerrit-Change-Number: 14913
Gerrit-PatchSet: 1
Gerrit-Owner: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails

2019-12-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14677 )

Change subject: IMPALA-9137: Blacklist node if a DataStreamService RPC to the 
node fails
..


Patch Set 12:

> Depends on the particular RPC. When we blacklist for a failed
 > Exec() rpc, it will never have been retried. For TransmitData(), it
 > will only be retried if the error that came back is "server too
 > busy"

How about connection / connectivity errors? If the connection is broken, do we 
make any attempt to retry connection establishment or do we just fail?

 > If auto-scaling takes into account blacklisting, I would probably
 > say that's the wrong approach and we should think about changing it
 > to only consider the statestore view of cluster membership.

Well if a node is blacklisted, no queries get scheduled on that node right? So 
functionally, the node isn't doing any useful work, so it might make sense for 
the auto-scaler to think it is "dead" and no longer part of the executor group. 
Perhaps a better approach would be to only consider a node as "dead" if it has 
been blacklisted multiple times. I think this is a larger discussion though, we 
haven't spent enough time thinking through node blacklisting + auto-scaler 
integration. I think we should, but probably as a separate item.

 > Definitely agree that we need to put some work into realistic fault
 > testing, and not just rely on our intuition about blacklisting/retry
 > policy.

Filed IMPALA-9253 for this.

 > I'm definitely interested in approaches to retries that don't
 > require marking statuses retriable (obviously I've had a lot of
 > concerns with that approach). I could see some problems with this
 > approach too, such as that we may have situations in the future
 > where we want to retry queries without doing any blacklisting (eg.
 > if we think there was stale metadata and replanning the query will
 > let it succeed)

Yeah, its possible in the future that we might want to retry queries even if a 
node hasn't been blacklisted, but I don't see a strong use case for that type 
of pattern right now. In the interest of not over-engineering the initial 
solution, I'm fine with just making retries blacklist / cluster membership 
driven.

 > or probably worse - that you could have a query
 > that fails due to a failed rpc and reports that the node should be
 > blacklisted, and then also hits another error that isn't retry-able
 > but we retry it anyways. Not sure how common a situation like that
 > would be, though.

Definitely possible, but a bit of an edge case. A query would have to fail for 
two separate reasons at approximately the same time: an error that triggers a 
blacklist, and an error that doesn't. Whenever the Coordinator gets an error, 
it cancels the rest of the query, so the non-blacklist error has to occur 
before the cancellation is processed. Its something we should fix, but I think 
we can defer out of the initial implementation. Filed IMPALA-9124 as a follow 
up.

The fix would probably be to ensure that all failed fragments trigger a 
blacklist or failed because they were cancelled.


--
To view, visit http://gerrit.cloudera.org:8080/14677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c
Gerrit-Change-Number: 14677
Gerrit-PatchSet: 12
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 16 Dec 2019 21:30:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] Added support of Kudu DATE type to Impala

2019-12-16 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14705 )

Change subject: Added support of Kudu DATE type to Impala
..


Patch Set 3:

One thing I think is missing here - adding support for DATE to our min-max 
runtime filters. I think its fine to leave that for a followup patch, though 
please file a JIRA for it, but you'll probably also need to make sure in this 
patch that we aren't trying to create DATE min-max filters, which would 
probably result in a crash. If you're not sure how to write a query that will 
generate a min-max filter in order to test this, feel free to reach out to me.

See the DECIMAL patch for how they temporarily avoided generated DECIMAL 
min-max filters: 
https://gerrit.cloudera.org/#/c/9368/10/fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java


--
To view, visit http://gerrit.cloudera.org:8080/14705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
Gerrit-Change-Number: 14705
Gerrit-PatchSet: 3
Gerrit-Owner: Volodymyr Verovkin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Volodymyr Verovkin 
Gerrit-Comment-Date: Mon, 16 Dec 2019 19:53:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4400: aggregate runtime filters locally

2019-12-16 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14538 )

Change subject: IMPALA-4400: aggregate runtime filters locally
..


Patch Set 15:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/14538/15//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14538/15//COMMIT_MSG@18
PS15, Line 18: can done
typo


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/exec/kudu-scan-node-base.cc
File be/src/exec/kudu-scan-node-base.cc:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/exec/kudu-scan-node-base.cc@110
PS15, Line 110:   if (filter_ctxs_.size() > 0) WaitForRuntimeFilters();
Maybe outside the scope of this patch, but it looks like the hdfs scanners 
don't call this until GetNext(), which is nice since it means that all of the 
setup can happen before we wait. Might want to do the same thing for Kudu.


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h@136
PS15, Line 136:   bool HasFragmentIdx(int fragment_idx);
const


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/coordinator-backend-state.h@140
PS15, Line 140:   bool HasFragmentIdx(const std::unordered_set& 
fragment_idxs);
const


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.h
File be/src/runtime/query-state.h:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.h@161
PS15, Line 161:   RuntimeFilterBank* filter_bank() { return filter_bank_.get(); 
}
const


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.cc
File be/src/runtime/query-state.cc:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/query-state.cc@a679
PS15, Line 679:
Looks like maybe this was the only use of 'fragment_map_' anywhere and its 
unnecessary now?


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h
File be/src/runtime/runtime-filter-bank.h:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@68
PS15, Line 68: AllocateScratchBloomFilter()
Not your change, but this is wrong, since the sentence refers to both bloom and 
min-max filters. Maybe change it to say "AllocateScratch*Filter()"


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@79
PS15, Line 79: PublishGlobalFilte
 : /// ()
typo


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-filter-bank.h@184
PS15, Line 184: locked separately
typo


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-state.h
File be/src/runtime/runtime-state.h:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/runtime/runtime-state.h@63
PS15, Line 63: /// A collection of items that are part of the global state of a 
query and shared across
Not your change, but this comment seems wrong/confusing to me. RuntimeState 
isn't shared across a query, its per-fragment instance, right?


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/scheduling/query-schedule.cc
File be/src/scheduling/query-schedule.cc:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/scheduling/query-schedule.cc@246
PS15, Line 246:   for (const FInstanceExecParams& instance_params: 
instance_exec_params) {
Looks like you're relying on the fact that all instances for a particular host 
will be contiguous within instance_exec_params. Might want to document that 
assumption where instance_exec_params is declared, or even make it a map from 
host to finstances.


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/bloom-filter.cc
File be/src/util/bloom-filter.cc:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/bloom-filter.cc@251
PS15, Line 251:   if (other.AlwaysFalse()) return;
Doesn't look like you're handling the always_true case, maybe add a DCHECK?


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.h
File be/src/util/min-max-filter.h:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.h@74
PS15, Line 74:   virtual void SetAlwaysTrue() = 0;
protected?


http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.cc
File be/src/util/min-max-filter.cc:

http://gerrit.cloudera.org:8080/#/c/14538/15/be/src/util/min-max-filter.cc@324
PS15, Line 324: // Sets 'always_true_' to true and clears the values of 'min_', 
'max_', 'min_buffer_',
Seems pretty clear from the code, so probably unnecessary, or maybe leave it in 
the header.


http://gerrit.cloudera.org:8080/#/c/14538/15/common/protobuf/data_stream_service.proto
File common/protobuf/data_stream_service.proto:

http://gerrit.cloudera.org:8080/#/c/14538/15/common/protobuf/data_stream_service.proto@136
PS15, Line 136:   // optional int32 dst_fragment_idx = 3;
Any reason not to just completely remove this and renumber? Its completely 

[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14898 )

Change subject: IMPALA-9244 Bump up boost to work on aarch64
..


Patch Set 3:

This built OK on all the supported platforms. Thanks for the contribution!


--
To view, visit http://gerrit.cloudera.org:8080/14898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24
Gerrit-Change-Number: 14898
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 19:16:51 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14898 )

Change subject: IMPALA-9244 Bump up boost to work on aarch64
..

IMPALA-9244 Bump up boost to work on aarch64

Boost support ARM64 platform since version 1.58.0
https://github.com/boostorg/context/blob/boost-1.58.0/build/Jamfile.v2#L224
and Kudu-84086fe uses 1.61.0, so let's bump up
boost to 1.61.0.

Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24
Reviewed-on: http://gerrit.cloudera.org:8080/14898
Reviewed-by: Tim Armstrong 
Tested-by: Tim Armstrong 
---
M buildall.sh
1 file changed, 8 insertions(+), 5 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/14898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24
Gerrit-Change-Number: 14898
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Tim Armstrong 


[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14898 )

Change subject: IMPALA-9244 Bump up boost to work on aarch64
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/14898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24
Gerrit-Change-Number: 14898
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 19:16:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] [DOCS] Clarification on HADOOP-15720 in impala known issues.xml

2019-12-16 Thread Alex Rodoni (Code Review)
Alex Rodoni has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/14906 )

Change subject: [DOCS] Clarification on HADOOP-15720 in impala_known_issues.xml
..

[DOCS] Clarification on HADOOP-15720 in impala_known_issues.xml

Change-Id: I724085ee63005cb2ce68a4532e6cf315d4799e7a
Reviewed-on: http://gerrit.cloudera.org:8080/14906
Tested-by: Impala Public Jenkins 
Reviewed-by: Robbie Zhang 
Reviewed-by: Joe McDonnell 
---
M docs/topics/impala_known_issues.xml
1 file changed, 7 insertions(+), 7 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Robbie Zhang: Looks good to me, but someone else must approve
  Joe McDonnell: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/14906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I724085ee63005cb2ce68a4532e6cf315d4799e7a
Gerrit-Change-Number: 14906
Gerrit-PatchSet: 2
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Robbie Zhang 


[Impala-ASF-CR] [DOCS] Clarification on HADOOP-15720 in impala known issues.xml

2019-12-16 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14906 )

Change subject: [DOCS] Clarification on HADOOP-15720 in impala_known_issues.xml
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14906
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I724085ee63005cb2ce68a4532e6cf315d4799e7a
Gerrit-Change-Number: 14906
Gerrit-PatchSet: 1
Gerrit-Owner: Alex Rodoni 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Robbie Zhang 
Gerrit-Comment-Date: Mon, 16 Dec 2019 18:53:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8547: get json object fails to get value for numeric key

2019-12-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14905 )

Change subject: IMPALA-8547: get_json_object fails to get value for numeric key
..


Patch Set 1:

(1 comment)

Thanks for the contribution Euguene!

A few comments:

It would be great if you could include some more information in the commit 
message. Take a look at 
https://cwiki.apache.org/confluence/display/IMPALA/Contributing+to+Impala for 
an example of an Impala commit message. The commit message should include 
information about how to reproduce the issue, how this patch fixes the issue, 
and how this patch was tested.

For the patch itself, the behavior here seems to be a bit ambiguous between SQL 
engines. While Hive (and Spark SQL) can execute the query you pasted in the 
JIRA, the Hive docs 
(https://cwiki.apache.org/confluence/display/Hive/LanguageManual+UDF) actually 
state that:

 Extracts json object from a json string based on json path specified, and 
returns json string of the extracted json object. It will return null if the 
input json string is invalid. NOTE: The json path can only have the characters 
[0-9a-z_], i.e., no upper-case or special characters. Also, the keys *cannot 
start with numbers.* This is due to restrictions on Hive column names.

I checked MySQL as well, and it actually cannot run the query from the JIRA 
description (JSON_EXTRACT is the equivalent in MySQL). It fails with the error: 
"Query Error: Error: ER_INVALID_JSON_PATH: Invalid JSON path expression. The 
error is around character position 3."

However, if you put quotations around the path identifier, it works:

 select JSON_EXTRACT('{"1": 5}', '$."1"');

https://stackoverflow.com/questions/49557144/how-to-extract-values-from-a-numeric-keyed-nested-json-field-in-mysql
 has more info.

Oddly enough, MS SQL Server and Oracle follows the MySQL behavior, but Postgres 
follows the Hive behavior.

In general, Impala tries to follow MySQL behavior. Given that MS SQL and Oracle 
follow the MySQL behavior as well, I think Impala should too.

http://gerrit.cloudera.org:8080/#/c/14905/1/be/src/util/string-util.cc
File be/src/util/string-util.cc:

http://gerrit.cloudera.org:8080/#/c/14905/1/be/src/util/string-util.cc@76
PS1, Line 76: FindEndOfIdentifier
need to update the docs of this method in string-util.h. The docs currently say 
that this method is to find C style identifiers. A C style identifier cannot 
start with a digit 
(https://docs.microsoft.com/en-us/cpp/c-language/c-identifiers?view=vs-2019).



--
To view, visit http://gerrit.cloudera.org:8080/14905
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7df037ccf2c79da0ba86a46df1dd28ab0e9a45f4
Gerrit-Change-Number: 14905
Gerrit-PatchSet: 1
Gerrit-Owner: Eugene Zimichev 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 16 Dec 2019 18:38:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9137: Blacklist node if a DataStreamService RPC to the node fails

2019-12-16 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14677 )

Change subject: IMPALA-9137: Blacklist node if a DataStreamService RPC to the 
node fails
..


Patch Set 12:

> Right, but that RPC has already been retried multiple times, right?
 > I should probably validate that for these specific error codes, the
 > RPCs are actually retried.

Depends on the particular RPC. When we blacklist for a failed Exec() rpc, it 
will never have been retried. For TransmitData(), it will only be retried if 
the error that came back is "server too busy"

 > So depending on an external auto-scaling policy, its possible that
 > blacklisting a node (even for 12 seconds) makes the executor group
 > unhealthy, and causes the auto-scaler to create a new executor
 > group (a potentially expensive operation).

If auto-scaling takes into account blacklisting, I would probably say that's 
the wrong approach and we should think about changing it to only consider the 
statestore view of cluster membership.

 > In general, I think we can be pretty test driven here. The best way
 > to figure out what additional errors codes to add would be to run
 > some network fault injection tests and see what error codes Impala
 > fails with. I'll open a follow up JIRA for that.

Definitely agree that we need to put some work into realistic fault testing, 
and not just rely on our intuition about blacklisting/retry policy.

 > I generally wanted to avoid doing this. I think it would be better
 > for only the Coordinator to make the ultimate decision about
 > whether a node is blacklisted. KrpcDataStreamSender just provides
 > enough information for the Coordinator to make the blacklisting
 > decision. Otherwise the logic for determining if a node should be
 > blacklisted will be spread across multiple classes, making it hard
 > to reason about blacklisting decisions.

I don't think it would be so bad to add a utility function like 
IsRetriableRpcError(kudu::Status) or whatever to keep the logic consistent, but 
I don't feel strongly about how exactly we arrange this, as long as we're 
getting the correct interaction between blacklisting and retrying.

 > However, I think you do bring up a good. We don't want a failed RPC
 > to trigger a query retry unless we are actually blacklisting the
 > target node. After thinking through it some more, maybe marking
 > queries as "retryable" isn't necessary at all. Instead, whenever a
 > query fails + blacklists a node, it is retried. So query retries
 > are blacklisting driven. This would mean abandoning the patch for
 > IMPALA-9138 (Classify certain errors as retryable), which I'm fine
 > with doing. Let me know if you feel differently.

I'm definitely interested in approaches to retries that don't require marking 
statuses retriable (obviously I've had a lot of concerns with that approach). I 
could see some problems with this approach too, such as that we may have 
situations in the future where we want to retry queries without doing any 
blacklisting (eg. if we think there was stale metadata and replanning the query 
will let it succeed), or probably worse - that you could have a query that 
fails due to a failed rpc and reports that the node should be blacklisted, and 
then also hits another error that isn't retry-able but we retry it anyways. Not 
sure how common a situation like that would be, though.

 > > but there's still the problem that this patch will only blacklist
 > > one node per query.
 >
 > Yeah, maybe this is too conservative. Assuming transparent query
 > retries support "x" number of retries, Impala should be able to
 > tolerate "x" faults across the duration of the query (and its
 > retries). Maybe that is okay, maybe not. Don't have strong feelings
 > here.

I think its probably fine to leave as is for now, until we have more confidence 
that blacklisting is really working how we want it to.


--
To view, visit http://gerrit.cloudera.org:8080/14677
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I733cca13847fde43c8ea2ae574d3ae04bd06419c
Gerrit-Change-Number: 14677
Gerrit-PatchSet: 12
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 16 Dec 2019 18:37:11 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-9244 Bump up boost to work on aarch64

2019-12-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14898 )

Change subject: IMPALA-9244 Bump up boost to work on aarch64
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/14898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia54aeb93f1a99a0f976fc63f2127c4f06b5ffb24
Gerrit-Change-Number: 14898
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 16 Dec 2019 17:04:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5291/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 16 Dec 2019 16:37:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2019-12-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14867 )

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14867/5/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14867/5/fe/src/main/java/org/apache/impala/service/Frontend.java@776
PS5, Line 776: if (authzFactory_.getAuthorizationConfig().isEnabled()
 : && !userHasDbLevelAccess(user, dbName)) {
 :   Privilege requiredPrivilege = Privilege.ANY;
 :
> nit, may be move this back into the if block since this is unnecessary othe
I kept them outside, as moving them inside + using them in an if would increase 
the nesting of the while loop further.



--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 4
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 16 Dec 2019 16:10:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

2019-12-16 Thread Csaba Ringhofer (Code Review)
Hello Quanlong Huang, Vihang Karajgaonkar, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/14867

to look at the new patch set (#6).

Change subject: IMPALA-9222: Speed up show tables/DBs if the user has access to 
parent db/server
..

IMPALA-9222: Speed up show tables/DBs if the user has access to parent db/server

Currently we always do the auth check for tables/dbs individually.
If the user has privileges higher in the hierarchy then it is not
necessary to do these checks as they will all succeed.

This change adds a higher level check before the individual checks.
The optimization is only enabled for Sentry, as Ranger has deny
policies, so server/db level access does not guarantee db/table level
access.

Testing:
- the existing auth related test coverage seems enough
- there are no tests for deny policies yet - adding them seems a bigger
  task so I created a follow up jira: IMPALA-9252

Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
1 file changed, 59 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/67/14867/6
--
To view, visit http://gerrit.cloudera.org:8080/14867
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic1f5c5d1cf447a9f1cec46c45272f250b8580826
Gerrit-Change-Number: 14867
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] Added support of Kudu DATE type to Impala

2019-12-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14705 )

Change subject: Added support of Kudu DATE type to Impala
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14705/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14705/3//COMMIT_MSG@7
PS3, Line 7: Added
Can you create an Impala Jira and add it to the commit message?


http://gerrit.cloudera.org:8080/#/c/14705/3/be/src/exec/kudu-util.cc
File be/src/exec/kudu-util.cc:

http://gerrit.cloudera.org:8080/#/c/14705/3/be/src/exec/kudu-util.cc@136
PS3, Line 136: dv->ToString()
This will always return empty string if dv is invalid (and ToDaysSinceEpoch() 
would be true otherwise). I would simply return "Invalid DateValue.".


http://gerrit.cloudera.org:8080/#/c/14705/3/be/src/exec/kudu-util.cc@193
PS3, Line 193: case TYPE_DATE:
The changes in this file all seem write related - was there any change on the 
read side? TimestampValue has some special logic, e.g. 
https://github.com/apache/impala/blob/f2f348c0f93208a0f34c33b6a4dc82f4d9d4b290/be/src/exec/kudu-scanner.cc#L336

DateValue's in memory representation is the some as in Kudu, so it is possible 
that less logic is needed than for Timestamps, but it is weird that nothing 
changed.



--
To view, visit http://gerrit.cloudera.org:8080/14705
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I91656749a58ac769b54c2a63bdd4f85c89520b32
Gerrit-Change-Number: 14705
Gerrit-PatchSet: 3
Gerrit-Owner: Volodymyr Verovkin 
Gerrit-Reviewer: Adar Dembo 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Volodymyr Verovkin 
Gerrit-Comment-Date: Mon, 16 Dec 2019 14:53:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'

2019-12-16 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14846 )

Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show 
tables/databases'
..


Patch Set 7:

(3 comments)

The implementation looks good to me, but I am concerned about possible 
performance impact, see my comment in Frontend.java

http://gerrit.cloudera.org:8080/#/c/14846/7//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14846/7//COMMIT_MSG@15
PS7, Line 15: num_check_access_threads
I would prefer a name the contains "authorization" or "auth". The current name 
could mean many things.


http://gerrit.cloudera.org:8080/#/c/14846/7//COMMIT_MSG@18
PS7, Line 18: range
: of 1 to 32. The default value of 'num_check_access_threads' is 1.
nit: wrap at 72


http://gerrit.cloudera.org:8080/#/c/14846/7/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/14846/7/fe/src/main/java/org/apache/impala/service/Frontend.java@859
PS7, Line 859: submit
I am not familiar with ExecutorService, so I may be wrong here.

What happens if SHOW DATABASES is called by several users concurrently? Does 
MAX_CHECK_ACCESS_POOL_SIZE maximize the number of threads that can work in 
parallel, or the current thread (where doGetTableNames is called) can also do 
some work?

If no more than MAX_CHECK_ACCESS_POOL_SIZE threads can work at the same time 
and the default value 1 is used, then this change serializes parallel SHOW 
DATABASES calls which is a regression compared to the current state.

I can imagine that in certain cases bumping MAX_CHECK_ACCESS_POOL_SIZE can help 
a lot, but we should ensure that the default 1 does not cause significant 
regression.

related stackoverflow question: 
https://stackoverflow.com/questions/6581188/is-there-an-executorservice-that-uses-the-current-thread



--
To view, visit http://gerrit.cloudera.org:8080/14846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Gerrit-Change-Number: 14846
Gerrit-PatchSet: 7
Gerrit-Owner: Zhou Xu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zhou Xu 
Gerrit-Comment-Date: Mon, 16 Dec 2019 13:14:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'

2019-12-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14846 )

Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show 
tables/databases'
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/5289/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/14846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Gerrit-Change-Number: 14846
Gerrit-PatchSet: 6
Gerrit-Owner: Zhou Xu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zhou Xu 
Gerrit-Comment-Date: Mon, 16 Dec 2019 08:17:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'

2019-12-16 Thread Zhou Xu (Code Review)
Zhou Xu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/14846 )

Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show 
tables/databases'
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14846/6/be/src/common/global-flags.cc
File be/src/common/global-flags.cc:

http://gerrit.cloudera.org:8080/#/c/14846/6/be/src/common/global-flags.cc@299
PS6, Line 299: "tables/databases. This configuration is applicable only 
when authorization is "
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/14846/6/be/src/common/global-flags.cc@300
PS6, Line 300: "enabled. A value of 1 disables multi-threaded execution for 
checking access. "
> line too long (92 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/14846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Gerrit-Change-Number: 14846
Gerrit-PatchSet: 7
Gerrit-Owner: Zhou Xu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zhou Xu 
Gerrit-Comment-Date: Mon, 16 Dec 2019 08:02:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'

2019-12-16 Thread Zhou Xu (Code Review)
Zhou Xu has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/14846 )

Change subject: IMPALA-9195: Using multithreaded execution to accelerate 'show 
tables/databases'
..

IMPALA-9195: Using multithreaded execution to accelerate 'show tables/databases'

If Sentry authorization is enabled, users with multi group-policies
will take time to get the result of 'show tables/databases'. It seems
that ResourceAuthorizationProvider.hasAccess performs bad for users
with complex group-policies, IMPALA-9242 will target to address this
problem.

This patch provides a config option 'num_check_access_threads' to accelerate
'show tables/databases' by using multithreading. This configuration is
applicable only when authorization is enabled. A value of 1 disables
multi-threaded execution for checking access. The value must be in the range
of 1 to 32. The default value of 'num_check_access_threads' is 1.

Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
---
M be/src/common/global-flags.cc
M be/src/util/backend-gflag-util.cc
M common/thrift/BackendGflags.thrift
M fe/src/main/java/org/apache/impala/service/BackendConfig.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
5 files changed, 115 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/14846/7
--
To view, visit http://gerrit.cloudera.org:8080/14846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I860e0d18afa0421665f8b3b1c5561d6bdacc5e96
Gerrit-Change-Number: 14846
Gerrit-PatchSet: 7
Gerrit-Owner: Zhou Xu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zhou Xu