[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15798 )

Change subject: IMPALA-9382: part 1: transposed profile prototype
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift
File common/thrift/RuntimeProfile.thrift:

http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift@249
PS16, Line 249: an averaged profile
  : // for the fragment is also included with averaged counter 
values.
> Does it increase the serialized size for V1 by a noticeable amount? That wo
It shouldn't change the serialized size because of how the Thrift encoding 
works: it simply doesn't include unset fields. From the generated code:

if (this->__isset.aggregated) {
  xfer += oprot->writeFieldBegin("aggregated", ::apache::thrift::protocol::T_STRUCT, 13);
  xfer += this->aggregated.write(oprot);
  xfer += oprot->writeFieldEnd();
}
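To make the size argument concrete, here is a minimal sketch of a toy encoder. It is loosely modelled on the Thrift binary protocol's field header (one type byte plus a two-byte field id) and is not the real Thrift library. The function `write_profile` and the `num_counters` field are hypothetical; only the field name `aggregated` and field id 13 come from the generated code quoted above.

```python
import struct
from typing import Optional

# Thrift type ids for the stop marker, i64 and struct.
T_STOP, T_I64, T_STRUCT = 0, 10, 12


def field_header(ftype: int, fid: int) -> bytes:
    """Encode a field header: 1-byte type followed by a 2-byte field id."""
    return struct.pack(">bh", ftype, fid)


def write_profile(num_counters: int, aggregated: Optional[bytes] = None) -> bytes:
    """Serialize a toy struct; the optional field is skipped when unset."""
    out = field_header(T_I64, 1) + struct.pack(">q", num_counters)
    if aggregated is not None:  # mirrors the __isset.aggregated check
        out += field_header(T_STRUCT, 13) + aggregated
    return out + bytes([T_STOP])


# Writer that never sets the new field (the V1 case).
v1 = write_profile(5)
# Writer that sets it to an empty nested struct (just a stop byte).
v2 = write_profile(5, aggregated=bytes([T_STOP]))
print(len(v1), len(v2))
```

In this sketch the unset case costs no bytes at all, which is the behaviour the comment relies on.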



--
To view, visit http://gerrit.cloudera.org:8080/15798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a
Gerrit-Change-Number: 15798
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 04:54:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16379 )

Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16379

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77
Gerrit-Change-Number: 16379
Gerrit-PatchSet: 3
Gerrit-Owner: abeltian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 04:23:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16379 )

Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6371/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16379

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77
Gerrit-Change-Number: 16379
Gerrit-PatchSet: 3
Gerrit-Owner: abeltian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 04:23:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16379 )

Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
..


Patch Set 2: Code-Review+2

Slightly cleaned up the commit message but looks good. Thank you for 
contributing! I'd be interested to talk more about how to test against Alluxio 
if you have time.


--
To view, visit http://gerrit.cloudera.org:8080/16379

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77
Gerrit-Change-Number: 16379
Gerrit-PatchSet: 2
Gerrit-Owner: abeltian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 04:22:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10087: IMPALA-6050 causes alluxio not to be supported

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#2) to the change originally 
created by abeltian. ( http://gerrit.cloudera.org:8080/16379 )

Change subject: IMPALA-10087: IMPALA-6050 causes alluxio not to be supported
..

IMPALA-10087: IMPALA-6050 causes alluxio not to be supported

This change adds file type support for Alluxio.
Alluxio URLs have a different prefix, such as:
alluxio://zk@zk-1:2181,zk-2:2181,zk-3:2181/path/

Testing:
Add unit test for alluxio file system type checks.

Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77
---
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M fe/src/test/java/org/apache/impala/common/FileSystemUtilTest.java
2 files changed, 11 insertions(+), 1 deletion(-)
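
As a rough illustration of the kind of scheme check the commit message describes, here is a minimal Python sketch. It is not the Java code in FileSystemUtil.java. The helper name `is_supported_path` and the contents of `SUPPORTED_SCHEMES` are assumptions made for the example; only the `alluxio` scheme and the sample URL come from the commit message.

```python
from urllib.parse import urlsplit

# Hypothetical set of schemes treated as supported; only "alluxio" comes
# from the commit message, the others are placeholders for illustration.
SUPPORTED_SCHEMES = {"hdfs", "s3a", "alluxio"}


def is_supported_path(uri: str) -> bool:
    """Return True if the URI's scheme is in the supported set."""
    return urlsplit(uri).scheme.lower() in SUPPORTED_SCHEMES


alluxio_uri = "alluxio://zk@zk-1:2181,zk-2:2181,zk-3:2181/path/"
print(urlsplit(alluxio_uri).scheme)
print(is_supported_path(alluxio_uri))
print(is_supported_path("ftp://example.com/path/"))
```

The multi-host authority part does not matter here, since only the scheme before `://` is inspected.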


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/79/16379/2
--
To view, visit http://gerrit.cloudera.org:8080/16379

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id92ec9cb0ee241a039fe4a96e1bc2ab3eaaf8f77
Gerrit-Change-Number: 16379
Gerrit-PatchSet: 2
Gerrit-Owner: abeltian 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 25:

That's great news! It looks like some of the Iceberg-related tests failed - 
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11883/  
https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/11884/ 
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/3067/

But the good news is that it loaded the data successfully.


--
To view, visit http://gerrit.cloudera.org:8080/16143

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Tue, 01 Sep 2020 04:09:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12:

Filed IMPALA-10119 for the flaky test.


--
To view, visit http://gerrit.cloudera.org:8080/16349

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 03:59:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 13:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6370/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16349

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 13
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 03:43:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 13: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16349

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 13
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 03:43:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12:

Looks unrelated, will rerun and file a JIRA


--
To view, visit http://gerrit.cloudera.org:8080/16349

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 03:42:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12:

> Patch Set 12: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/

Flaky test?

shell.test_shell_interactive.TestImpalaShellInteractive.test_history_does_not_duplicate_on_interrupt[table_format_and_file_extension: ('textfile', '.txt') | protocol: hs2] (from pytest)

E   TIMEOUT: Timeout exceeded.
E   
E   version: 3.3
E   command: /home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell
E   args: ['/home/ubuntu/Impala/shell/build/impala-shell-4.0.0-SNAPSHOT/impala-shell', '--protocol=hs2', '-ilocalhost:21050']
E   searcher: 
E   buffer (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] default> '
E   before (last 100 chars): ' default> select 2;\r\n^C\r\n[localhost:21050] default> '
E   after: 
E   match: None
E   match_index: None
E   exitstatus: None
E   flag_eof: False
E   pid: 12993
E   child_fd: 24
E   closed: False
E   timeout: 30
E   delimiter: 
E   logfile: None
E   logfile_read: None
E   logfile_send: None
E   maxread: 2000
E   ignorecase: False
E   searchwindowsize: None
E   delaybeforesend: 0.05
E   delayafterclose: 0.1
E   delayafterterminate: 0.1


--
To view, visit http://gerrit.cloudera.org:8080/16349

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 03:07:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Aman Sinha (Code Review)
Aman Sinha has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419
PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', 
timestamp_col >= TIMESTAMP '2009-12-01 00:00:00'
> Don't we still need to keep the date_col = cast(timestamp_col as date) pred
Good point. All the use cases I have seen so far were ones where date_col was 
derived from the timestamp column. For your example, we do need to keep the 
cast predicate when the constant predicate is a range predicate. I think the 
code change is small, but I need to think about how to create a test data set 
for this.


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
File 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4
PS9, Line 4: functional_parquet
> We generally don't include database names in the test files, since the infr
I would be OK with running against other data sets, but I struggled to load 
the alltypes_date_partition table and had an offline discussion with Shant. 
For text-format loading, the following error occurred because it went through 
the Hive load process rather than Impala:

The load-functional-planner-core-hive-generated-text-none-none.sql.log had the 
following error:
   "Caused by: org.apache.hadoop.hive.ql.parse.SemanticException: Dynamic 
partition strict mode requires at least one static partition column. To turn 
this off set hive.exec.dynamic.partition.mode=nonstrict"

Setting hive.exec.dynamic.partition.mode to nonstrict got past that, but then 
it ran into the default limit on the number of dynamic partitions:
"The maximum number of dynamic partitions is controlled by 
hive.exec.max.dynamic.partitions and hive.exec.max.dynamic.partitions.pernode. 
Maximum was set to 100 partitions per node, number of dynamic partitions on 
this node: 101"

I could bump this up too, but the Tez job does take much longer to execute, 
so I wasn't sure whether it is worthwhile.

I could move this to TestQueriesParquetTables unless you have other suggestions.



--
To view, visit http://gerrit.cloudera.org:8080/16346

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 02:20:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java
File fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java:

http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@55
PS9, Line 55:* predicates.
Mention how 'candidates' is used?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@61
PS9, Line 61: (E
nit: these parens prob aren't needed, right?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@66
PS9, Line 66:  !(bp.getOp() == BinaryPredicate.Operator.EQ)
can't this be !=?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@128
PS9, Line 128: Map.Entry
Can't this be Map.Entry to keep it type-safe and avoid cast?


http://gerrit.cloudera.org:8080/#/c/16346/9/fe/src/main/java/org/apache/impala/analysis/ConstantPredicateHandler.java@132
PS9, Line 132: Map.Entry
Map.Entry>?


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@419
PS9, Line 419:predicates: timestamp_col <= TIMESTAMP '2010-12-01 00:00:00', 
timestamp_col >= TIMESTAMP '2009-12-01 00:00:00'
Don't we still need to keep the date_col = cast(timestamp_col as date) 
predicate for this to be correct in cases where this isn't guaranteed to be 
true in the underlying data set?

E.g. one counter-example would be

  date_col     timestamp_col
  2009-12-01   2009-12-02 00:00:00

I.e. I think we need to keep the equality predicate around.
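
The counter-example can be checked with a small Python sketch. It assumes the original query combined the equality date_col = cast(timestamp_col as date) with a range on the date, which is inferred from the discussion rather than shown in the test file. The row list, the first "consistent" row and the function names are hypothetical.

```python
from datetime import date, datetime

# (date_col, timestamp_col) rows; the second is the counter-example row.
rows = [
    (date(2009, 12, 1), datetime(2009, 12, 1, 8, 30)),  # hypothetical consistent row
    (date(2009, 12, 1), datetime(2009, 12, 2, 0, 0)),   # date_col != cast(timestamp_col as date)
]

LOW = datetime(2009, 12, 1)
HIGH = datetime(2010, 12, 1)


def range_only(d: date, ts: datetime) -> bool:
    """Only the propagated range predicates on timestamp_col."""
    return LOW <= ts <= HIGH


def range_and_equality(d: date, ts: datetime) -> bool:
    """Range predicates plus the retained equality predicate."""
    return range_only(d, ts) and d == ts.date()


only_range = [r for r in rows if range_only(*r)]
with_equality = [r for r in rows if range_and_equality(*r)]
print(len(only_range), len(with_equality))
```

With only the range predicates the counter-example row is returned, while keeping the equality predicate filters it out.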


http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test
File 
testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/9/testdata/workloads/functional-query/queries/QueryTest/range-constant-propagation.test@4
PS9, Line 4: functional_parquet
We generally don't include database names in the test files, since the infra 
should switch to the appropriate functional database.

We can move it to TestQueriesParquetTables if we only want it to run on the 
parquet data set (not kudu, etc).



--
To view, visit http://gerrit.cloudera.org:8080/16346

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 01:17:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16311 )

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..

IMPALA-10030: Remove unnecessary jar dependencies

Remove the dependency on hadoop-hdfs. This jar file contains the core
code for implementing HDFS, and thus pulls in a bunch of unnecessary
transitive dependencies. Impala currently only requires this jar for
some configuration key names. Most of these configuration key names have
been moved to the appropriate HDFS client jars, and some others are
deprecated altogether. Removing this jar required making a few code
changes to move the location of the referenced configuration keys.

Removes all transitive Kafka dependencies from the Apache Ranger
dependency. Previously, Impala only excluded Kafka jars with binary
version kafka_2.11; however, it seems that Ranger recently upgraded the
dependency version to kafka_2.12. Now all Kafka dependencies are
excluded, regardless of artifact name.

Removes all transitive dependencies from the Apache Ozone dependency.
Impala has a dependency on the Ozone client shaded-jar, which already
includes all required transitive dependencies. For some reason, Ozone
still pulls in some transitive dependencies even though they are not
needed.

Made some other minor cleanup / improvements in the fe/pom.xml file.

This saves about 70 MB of space in the Docker images.

Testing:
* Ran exhaustive tests
* Ran on-prem cluster E2E tests

Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Reviewed-on: http://gerrit.cloudera.org:8080/16311
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/pom.xml
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/util/FsPermissionChecker.java
M fe/src/main/java/org/apache/impala/util/HdfsCachingUtil.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
5 files changed, 51 insertions(+), 102 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16311

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 6
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16311 )

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..


Patch Set 5: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16311

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 01:15:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/


--
To view, visit http://gerrit.cloudera.org:8080/16349

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 00:54:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16390

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 01 Sep 2020 00:09:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..

IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

HIVE-19064 introduced GenericHiveLexer as an intermediate class between
HiveLexer and Lexer. For ToSqlUtils.java to compile once we bump
CDP_BUILD_NUMBER to a build that includes this change on the Hive side,
this patch updates shaded-deps/hive-exec/pom.xml to include
GenericHiveLexer so that Impala can be built successfully.

Testing:
 - Verified that Impala could compile in a local development
   environment after applying this patch.

Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Reviewed-on: http://gerrit.cloudera.org:8080/16390
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M shaded-deps/hive-exec/pom.xml
1 file changed, 1 insertion(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16390

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9792: Implement splitting kudu scan ranges for greater parallelism

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16385 )

Change subject: IMPALA-9792: Implement splitting kudu scan ranges for greater 
parallelism
..


Patch Set 1:

(4 comments)

This seems basically OK, I think. I'm on the fence about whether we should do 
some additional cluster testing, but leaning towards no because the complexity 
is all in the Kudu layer and I don't think we'd learn much from testing at 
small scale.

http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@24
PS1, Line 24: Testing
> Did you do any performance testing to gauge the impact (good or bad)?
Yeah it'd be good to do some sanity checks to confirm that it gives the speedup 
expected. E.g. TPC-H or similar. Maybe just a single query would be fine, e.g. 
TPC-H Q1.


http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@25
PS1, Line 25: - Added e2e tests
Do we have any other tests that are going to exercise this code path, 
e.g. Kudu e2e tests with multithreading?


http://gerrit.cloudera.org:8080/#/c/16385/1/tests/query_test/test_kudu.py
File tests/query_test/test_kudu.py:

http://gerrit.cloudera.org:8080/#/c/16385/1/tests/query_test/test_kudu.py@1442
PS1, Line 1442: union
union all would be a little more efficient, no?


http://gerrit.cloudera.org:8080/#/c/16385/1/tests/query_test/test_kudu.py@1468
PS1, Line 1468: assert regular_num_inst < 
with_mt_dop_and_disabled_range_len_num_inst < \
  :with_mt_dop_num_inst < 
with_mt_dop_and_low_range_len_num_inst
I don't think the < operator works this way - you're going to be comparing a 
bool with an int. Probably best to have each inequality as a separate assert.
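
For reference, a minimal sketch of how Python itself evaluates such an expression: per the language reference, a chained comparison a < b < c is equivalent to a < b and b < c, so the bool-versus-int comparison only arises with explicit left-to-right grouping as in C-like languages. The variable names and values below are hypothetical stand-ins for the instance counts in the test.

```python
# Hypothetical instance counts; the second value is deliberately out of order.
regular, disabled_len, mt_dop, low_len = 5, 3, 9, 12

# Python chains comparisons: every adjacent pair must hold.
chained = regular < disabled_len < mt_dop < low_len
pairwise = regular < disabled_len and disabled_len < mt_dop and mt_dop < low_len

# Explicit left-to-right grouping compares a bool (0 or 1) with an int.
grouped = ((regular < disabled_len) < mt_dop) < low_len

print(chained, pairwise, grouped)
```

Separate asserts are still a reasonable choice, since a failure then points at the specific pair that is out of order.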



--
To view, visit http://gerrit.cloudera.org:8080/16385

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8
Gerrit-Change-Number: 16385
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 23:44:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses

2020-08-31 Thread Fredy Wijaya (Code Review)
Fredy Wijaya has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16389 )

Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH 
clauses
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16389/2/shell/impala_shell.py
File shell/impala_shell.py:

http://gerrit.cloudera.org:8080/#/c/16389/2/shell/impala_shell.py@1280
PS2, Line 1280: if self.DML_REGEX.match(query_type.lower()):
looks like there were failed tests in the dry-run

nit: this code can be simplified like below.

is_dml = self.DML_REGEX.match(query_type.lower())
return self._execute_stmt(query, is_dml=is_dml, print_web_link=True)
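
One detail worth noting about the simplified form: `re.match` returns a match object or `None` rather than a boolean. A minimal standalone sketch, assuming a stand-in pattern (the real `DML_REGEX` in impala_shell.py is not shown in this thread, and the helper name `is_dml_statement` is hypothetical), wraps the result in `bool()` if a strict `True`/`False` is wanted:

```python
import re

# Stand-in pattern for illustration only; not the actual DML_REGEX.
DML_REGEX = re.compile(r"^(insert|upsert|update|delete)$", re.I)


def is_dml_statement(query_type: str) -> bool:
    """Return a real boolean instead of a match object or None."""
    return bool(DML_REGEX.match(query_type.lower()))


print(is_dml_statement("INSERT"))
print(is_dml_statement("select"))
```

Whether the wrapping is needed depends on how the callee uses the flag; truthiness checks work either way.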



--
To view, visit http://gerrit.cloudera.org:8080/16389

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Gerrit-Change-Number: 16389
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 31 Aug 2020 22:06:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10064: Support constant propagation for eligible range predicates

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16346 )

Change subject: IMPALA-10064: Support constant propagation for eligible range 
predicates
..


Patch Set 9: Code-Review+1

(1 comment)

LGTM nice little fix.

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test:

http://gerrit.cloudera.org:8080/#/c/16346/7/testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test@461
PS7, Line 461: timestamp_col <= '2010-12-01';
> Done
Done



--
To view, visit http://gerrit.cloudera.org:8080/16346

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I811a1f8d605c27c7704d7fc759a91510c6db3c2b
Gerrit-Change-Number: 16346
Gerrit-PatchSet: 9
Gerrit-Owner: Aman Sinha 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 21:39:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16320 )

Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images
..


Patch Set 4:

I had thought about doing that split in the past - it seems like it would be 
useful. I don't see any obvious issues with doing it aside from tests making 
assumptions about scheduling.


-- 
To view, visit http://gerrit.cloudera.org:8080/16320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd
Gerrit-Change-Number: 16320
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 21:02:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10016: Split jars for Impala exec and coord Docker images

2020-08-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16320 )

Change subject: IMPALA-10016: Split jars for Impala exec and coord Docker images
..


Patch Set 4:

> (1 comment)
 >
 > The code changes basically look good. Let me know how testing goes,
 > I can approve it then.

Is there a change that we could make to the dockerized tests that would help us 
test the coordinator-only and executor-only images? At the moment, the 
dockerized tests use the impalad_coord_exec docker image, and that has been 
fine for coverage because impalad_coord_exec, impalad_executor, and 
impalad_coordinator have been so similar. I wonder how much work it would be to 
migrate to using one impalad_coordinator and three impalad_executor nodes (or 
one impalad_coord_exec and two impalad_executor nodes). Would this be a useful 
direction?


--
To view, visit http://gerrit.cloudera.org:8080/16320
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I899859a38d8ccab890de889a49ef132a89289dfd
Gerrit-Change-Number: 16320
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 20:10:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16311 )

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6369/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:57:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16311 )

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 5
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:57:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9382: part 1: transposed profile prototype

2020-08-31 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/15798 )

Change subject: IMPALA-9382: part 1: transposed profile prototype
..


Patch Set 16:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift
File common/thrift/RuntimeProfile.thrift:

http://gerrit.cloudera.org:8080/#/c/15798/16/common/thrift/RuntimeProfile.thrift@249
PS16, Line 249: an averaged profile
  : // for the fragment is also included with averaged counter 
values.
> It does. It will make the deserialized objects larger because of the extra
Does it increase the serialized size for V1 by a noticeable amount? That would 
be my main concern, since that corresponds to disk usage for the profile log.



--
To view, visit http://gerrit.cloudera.org:8080/15798
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0838c6a0872f57c696267ff4e92d29c08748eb7a
Gerrit-Change-Number: 15798
Gerrit-PatchSet: 16
Gerrit-Owner: Tim Armstrong 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:52:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16311 )

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..


Patch Set 4: Code-Review+2

Thanks for the update, appreciate the due diligence on it!


--
To view, visit http://gerrit.cloudera.org:8080/16311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:42:06 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6368/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:39:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 12: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 12
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:39:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 11: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 11
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:39:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7054/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 11
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:28:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7053/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:17:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 11:

(10 comments)

Yeah, I guess some of those comments in the tests came from different patches; 
they are all cleaned up now.

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@98
PS10, Line 98: adjustNumD
> Maybe renamed as getNumDistinctValuesAdjusted().
Made it adjustNumDistinctValues. Since this is a private method, I want to avoid 
the get/set verbs so as not to conflate it with the standard conventions for 
public methods.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@191
PS10, Line 191:
> nit: seems like a move of the method in the module.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@211
PS10, Line 211: verifySelectCol("nullrows", "null_str",
> This comment can be removed.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@182
PS10, Line 182: // NDV(blanks) = 1, add 1 for nulls
  : // Bug: See IMPALA-7310, IMPALA-8094
  : //verifyNdvStmt("SELECT blanks FROM functional.nullrows", 
2);
> Seems like these lines can be removed.
Actually no, this is a different issue; I just adjusted the references.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@132
PS10, Line 132:  group_str h
> This comment is not accurate.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@136
PS10, Line 136: null_str is al
> Seems like the reference to c is not right here.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@138
PS10, Line 138: i
> same here.
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: (g
> same
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: i
> same here
Done


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@182
PS10, Line 182: = 1
> Maybe as // NDV(id) = 26, ndv(null_str) = 1, NDV(id)*ndv(null_str) = 26.
Done



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 11
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 19:11:42 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Hello Aman Sinha, Qifan Chen, David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16349

to look at the new patch set (#11).

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..

IMPALA-7310: Partial fix for NDV cardinality with NULLs.

This fix handles only the case where a column's cardinality is zero but
the column is nullable and the null stats indicate that there are null
values. In that case we adjust the cardinality from 0 to 1.

The cardinality of zero was especially problematic when calculating
cardinalities for multiple predicates with multiplication. The 0 would
propagate up the plan tree and result in poor plan choices such as
always using broadcast joins where shuffle would've been more optimal.
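
As a rough illustration of why the zero is so damaging, here is a minimal
Python sketch. It is not the code from SlotRef.java; the helper names
adjust_num_distinct_values and combined_ndv, and the use of a plain product
to combine per-column NDVs, are assumptions made only for this example.

```python
from typing import Iterable, Optional


def adjust_num_distinct_values(ndv: int, num_nulls: Optional[int],
                               nullable: bool = True) -> int:
    """Hypothetical helper: bump an NDV of 0 to 1 when the column is
    nullable and the null stats say that NULL values exist."""
    if ndv == 0 and nullable and num_nulls is not None and num_nulls > 0:
        return 1
    return ndv


def combined_ndv(ndvs: Iterable[int]) -> int:
    """Naive multi-column estimate: multiply the per-column NDVs."""
    result = 1
    for n in ndvs:
        result *= n
    return result


# One column with 26 distinct values next to an all-NULL column
# (NDV 0, 26 NULL rows).
raw = [26, 0]
adjusted = [adjust_num_distinct_values(26, 0),
            adjust_num_distinct_values(0, 26)]

print(combined_ndv(raw))       # 0: the zero wipes out the whole estimate
print(combined_ndv(adjusted))  # 26
```

Only the ndv == 0 case is touched; a column whose NDV is already positive
is returned unchanged, which matches the "partial fix" scope described above.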

Testing:
  * 26 Node TPC-DS 30TB run had better plans for Q4 and Q11
    - Q4 172s -> 80s
    - Q11 103s -> 77s
  * CardinalityTest
  * TpcdsPlannerTest

Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
---
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
M fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
6 files changed, 795 insertions(+), 784 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/49/16349/11
--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 11
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..


Patch Set 1:

This seems like a safe change and the reasoning makes sense.


--
To view, visit http://gerrit.cloudera.org:8080/16390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6367/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:37 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16390 )


Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..

IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

HIVE-19064 introduced the GenericHiveLexer class as an intermediate
class between HiveLexer and Lexer. So that ToSqlUtils.java still
compiles once we bump CDP_BUILD_NUMBER to a build that includes this
change on the Hive side, this patch updates
shaded-deps/hive-exec/pom.xml to include GenericHiveLexer so that
Impala can be built successfully.

Testing:
 - Verified that Impala could compile in a local development
   environment after applying this patch.

Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
---
M shaded-deps/hive-exec/pom.xml
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/90/16390/1
--
To view, visit http://gerrit.cloudera.org:8080/16390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for GenericHiveLexer

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16390 )

Change subject: IMPALA-10118: Update shaded-deps/hive-exec/pom.xml for 
GenericHiveLexer
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I27db1cb8de36dd86bae08b7177ae3f1c156d73bc
Gerrit-Change-Number: 16390
Gerrit-PatchSet: 1
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:56:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16350 )

Change subject: IMPALA-10099: Push down DISTINCT in Set operations
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 4
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:34:05 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10099: Push down DISTINCT in Set operations

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16350 )

Change subject: IMPALA-10099: Push down DISTINCT in Set operations
..

IMPALA-10099: Push down DISTINCT in Set operations

INTERSECT/EXCEPT are not duplicate preserving operations. The distinct
aggregations can happen in each operand, the leftmost operand only, or
after all the operands in a separate aggregation step. Except for a
couple of special cases, we would most often use the last strategy.

This change pushes the distinct aggregation down to the leftmost operand
in cases where there are no analytic functions, or when a distinct or
grouping operation already eliminates duplicates.

In general, DISTINCT placement such as in this case should be done
throughout the entire plan tree in a cost-based manner, as described in
IMPALA-5260.
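
The equivalence that makes the push-down safe can be sketched on plain
Python lists. This is a toy model and not Impala's planner or StmtRewriter
code; the function names are hypothetical, and treating INTERSECT and
EXCEPT as a membership filter against the right operand is an assumption
made for illustration.

```python
def distinct(rows):
    """Remove duplicates while keeping first-seen order."""
    seen = set()
    out = []
    for r in rows:
        if r not in seen:
            seen.add(r)
            out.append(r)
    return out


def intersect_distinct_last(left, right):
    """Filter the left operand first, de-duplicate in a final step."""
    right_set = set(right)
    return distinct([r for r in left if r in right_set])


def intersect_distinct_pushed(left, right):
    """De-duplicate the leftmost operand first, then filter."""
    right_set = set(right)
    return [r for r in distinct(left) if r in right_set]


def except_distinct_pushed(left, right):
    """EXCEPT with the distinct pushed into the leftmost operand."""
    right_set = set(right)
    return [r for r in distinct(left) if r not in right_set]


left = [1, 1, 2, 3, 3, 3]
right = [3, 3, 1, 4]

print(intersect_distinct_last(left, right))    # [1, 3]
print(intersect_distinct_pushed(left, right))  # [1, 3]
print(except_distinct_pushed(left, right))     # [2]
```

Both INTERSECT variants return the same rows, but the pushed-down form
filters three rows instead of six, which hints at where the runtime
savings reported under Testing could come from.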

Testing:
 * TpcdsPlannerTest
 * PlannerTest
 * TPC-DS 30TB Perf run for any affected queries
   - Q14-1 180s -> 150s
   - Q14-2 109s -> 90s
   - Q8 no significant change
 * SetOperation Planner Tests
 * Analyzer tests
 * Tpcds Functional Workload

Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Reviewed-on: http://gerrit.cloudera.org:8080/16350
Tested-by: Impala Public Jenkins 
Reviewed-by: Tim Armstrong 
---
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/setoperation-rewrite.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
6 files changed, 2,049 insertions(+), 1,806 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Tim Armstrong: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/16350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia248f1595df2ab48fbe70c778c7c32bde5c518a5
Gerrit-Change-Number: 16350
Gerrit-PatchSet: 5
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103
PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null 
values.
> Yes the intent of this patch per earlier notes is to only address the =0 ca
Done



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:25:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 10:

I can +2 once you've addressed Qifan's comments.


--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 18:02:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Shant Hovsepian (Code Review)
Shant Hovsepian has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103
PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null 
values.
> When the numDistinctValues > 0, such adjustment is not performed. I wonder
Yes, the intent of this patch, per earlier notes, is to address only the =0 
cases, as the general fix is a bit involved at the moment.



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 17:28:24 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Sahil Takiar (Code Review)
Sahil Takiar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16311 )

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..


Patch Set 4:

I ran exhaustive tests + on-prem E2E tests (L0s). I haven't run any K8s tests, 
but I can.

I think the hadoop-hdfs dependency change should be covered by the Impala 
exhaustive tests + L0s.

The Ranger dependency change is actually already present internally, so I don't 
think this actually does anything.

I confirmed with the Ozone team that this change is safe.


--
To view, visit http://gerrit.cloudera.org:8080/16311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 16:21:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 25: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6366/


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 16:20:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10030: Remove unnecessary jar dependencies

2020-08-31 Thread Sahil Takiar (Code Review)
Hello Tim Armstrong, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16311

to look at the new patch set (#4).

Change subject: IMPALA-10030: Remove unnecessary jar dependencies
..

IMPALA-10030: Remove unnecessary jar dependencies

Remove the dependency on hadoop-hdfs. This jar file contains the core
code for implementing HDFS, and thus pulls in a bunch of unnecessary
transitive dependencies. Impala currently only requires this jar for
some configuration key names. Most of these configuration key names have
been moved to the appropriate HDFS client jars, and some others are
deprecated altogether. Removing this jar required making a few code
changes to move the location of the referenced configuration keys.

Removes all transitive Kafka dependencies from the Apache Ranger
dependency. Previously, Impala only excluded Kafka jars with binary
version kafka_2.11; however, it seems Ranger recently upgraded the
dependency version to kafka_2.12. Now all Kafka dependencies are
excluded, regardless of artifact name.

Removes all transitive dependencies from the Apache Ozone dependency.
Impala has a dependency on the Ozone client shaded-jar, which already
includes all required transitive dependencies. For some reason, Ozone
still pulls in some transitive dependencies even though they are not
needed.

Made some other minor cleanup / improvements in the fe/pom.xml file.

This saves about 70 MB of space in the Docker images.

Testing:
* Ran exhaustive tests
* Ran on-prem cluster E2E tests

Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
---
M fe/pom.xml
M fe/src/main/java/org/apache/impala/service/JniFrontend.java
M fe/src/main/java/org/apache/impala/util/FsPermissionChecker.java
M fe/src/main/java/org/apache/impala/util/HdfsCachingUtil.java
M fe/src/test/java/org/apache/impala/service/JniFrontendTest.java
5 files changed, 51 insertions(+), 102 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/16311/4
--
To view, visit http://gerrit.cloudera.org:8080/16311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iadbb6142466f73f067dd7cf9d401ff81145c74cc
Gerrit-Change-Number: 16311
Gerrit-PatchSet: 4
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16389 )

Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH 
clauses
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6365/


--
To view, visit http://gerrit.cloudera.org:8080/16389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Gerrit-Change-Number: 16389
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 31 Aug 2020 15:29:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5022 part 1/2: Outer join simplification

2020-08-31 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16266 )

Change subject: IMPALA-5022 part 1/2: Outer join simplification
..


Patch Set 16:

(5 comments)

Looks good to me!

http://gerrit.cloudera.org:8080/#/c/16266/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16266/16//COMMIT_MSG@9
PS16, Line 9: As a general rule, an outer join can be converted to an inner 
join if
: there is a condition on the inner table that filters out 
non‑matching
: rows. In a left outer join, the right table is the inner table, 
while
: it is the left table in a right outer join. In a full outer join, 
both
: tables are inner tables. Conditions that are FALSE for nulls are
: referred to as null filtering conditions, and these are the 
conditions
: that enable the outer‑to‑inner join conversion to be made.
Maybe reworded as

"Outer joins in SQL can return rows with certain columns filled with NULLs when 
a match can not be found. However, such rows can be rejected by null-rejecting 
predicates. The conditions in a null-rejecting predicate that are always 
evaluated to FALSE for NULLs are referred to as null-filtering conditions.

In general, an outer join can be converted to an inner join if
there exist null-filtering conditions on the inner tables. In a left outer 
join, the right table is the inner table, while
in a right outer join it is the left table. In a full outer join, both tables 
are inner tables."
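
The rule in the rewording above can be checked on a toy data set. The
following is a minimal sketch using SQLite from the Python standard library
rather than Impala; the tables t1 and t2, their columns, and the predicate
t2.v > 5 are made up for illustration. The predicate evaluates to NULL (not
TRUE) for the NULL-extended rows, so it is null-filtering on the inner table
t2, and the outer and inner joins return the same rows.

```python
import sqlite3

# In-memory database with two small, hypothetical tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (id INTEGER);
    CREATE TABLE t2 (id INTEGER, v INTEGER);
    INSERT INTO t1 VALUES (1), (2), (3);
    INSERT INTO t2 VALUES (1, 10), (2, NULL);
""")

# Plain left outer join: the unmatched row (id 3) is NULL-extended.
left_outer = conn.execute(
    "SELECT t1.id, t2.v FROM t1 LEFT OUTER JOIN t2 ON t1.id = t2.id "
    "ORDER BY t1.id").fetchall()

# Same join with a null-filtering condition on the inner table t2.
filtered_outer = conn.execute(
    "SELECT t1.id, t2.v FROM t1 LEFT OUTER JOIN t2 ON t1.id = t2.id "
    "WHERE t2.v > 5 ORDER BY t1.id").fetchall()

# Inner join with the same condition.
inner = conn.execute(
    "SELECT t1.id, t2.v FROM t1 INNER JOIN t2 ON t1.id = t2.id "
    "WHERE t2.v > 5 ORDER BY t1.id").fetchall()
```

Here filtered_outer and inner hold the same rows, while left_outer still
contains the NULL-extended rows that the WHERE clause rejects.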


http://gerrit.cloudera.org:8080/#/c/16266/16//COMMIT_MSG@50
PS16, Line 50:
I think we need to add a high-level description of what work is done in this 
commit, and also of what the part 2 work will be.


http://gerrit.cloudera.org:8080/#/c/16266/13/fe/src/main/java/org/apache/impala/analysis/Analyzer.java
File fe/src/main/java/org/apache/impala/analysis/Analyzer.java:

http://gerrit.cloudera.org:8080/#/c/16266/13/fe/src/main/java/org/apache/impala/analysis/Analyzer.java@3277
PS13, Line 3277:*/
   :   private boolean isNullableConjunct(Expr e, List 
tupleIds) {
   : // A clause like "t1.v1 IS NOT NULL OR t2.v2 IS NOT NULL" 
and t1 in 'tupleIds' does
   : // not prove that t1.v1 can't be NULL, because when t2.v2 
IS NOT NULL, t1.v1 can be
   : // null. But a clause like "t1.v1 IS NOT NULL OR t1.v2 IS 
NOT NULL" proves that the
   : // t1 row as a whole can't be all-NULL.
   : Lis
> I changed to use the set retainAll method, but we should collect all of the
OK. It sounds like testing in one shot for t1.id>10 and t2.id<10 or t2.id>50 or 
t2.name='a' will not work.


http://gerrit.cloudera.org:8080/#/c/16266/12/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
File fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java:

http://gerrit.cloudera.org:8080/#/c/16266/12/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@321
PS12, Line 321: Condition
nit. "Conditional"


http://gerrit.cloudera.org:8080/#/c/16266/12/fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java@327
PS12, Line 327: f
> OK, I see. I changed this as the doc. But I think the 'case' is not Functio
Done



--
To view, visit http://gerrit.cloudera.org:8080/16266
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaa7804033fac68e93f33c387dc68ef67f803e93e
Gerrit-Change-Number: 16266
Gerrit-PatchSet: 16
Gerrit-Owner: Xianqing He 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Xianqing He 
Gerrit-Comment-Date: Mon, 31 Aug 2020 15:13:09 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7310: Partial fix for NDV cardinality with NULLs.

2020-08-31 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16349 )

Change subject: IMPALA-7310: Partial fix for NDV cardinality with NULLs.
..


Patch Set 10:

(14 comments)

Looks good!

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@98
PS10, Line 98: adjustNdv(
Maybe rename this to getNumDistinctValuesAdjusted().


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@103
PS10, Line 103: // Adjust an ndv of zero to 1 if stats indicate there are null 
values.
When numDistinctValues > 0, this adjustment is not performed. I wonder whether 
making the adjustment unconditional would hurt anything.
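
For readers following along, the adjustment being discussed can be sketched
roughly as below. This is only an illustration of the code comment quoted
above ("Adjust an ndv of zero to 1 if stats indicate there are null values"),
not the SlotRef code itself: the function name, the parameter names, and the
use of a plain NULL count are assumptions.

```python
def adjusted_ndv(num_distinct_values, num_nulls):
    """Hypothetical sketch: bump an NDV of 0 to 1 when stats report NULLs.

    Any non-zero NDV is returned unchanged, which is the conditional
    behaviour the review comment is asking about.
    """
    if num_distinct_values == 0 and num_nulls > 0:
        return 1
    return num_distinct_values


# An all-NULL column: stats report NDV = 0 but some NULLs.
all_null_column = adjusted_ndv(0, 5)

# A column with three distinct values and some NULLs is left alone.
mixed_column = adjusted_ndv(3, 5)
```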


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java
File fe/src/main/java/org/apache/impala/catalog/ColumnStats.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/main/java/org/apache/impala/catalog/ColumnStats.java@191
PS10, Line 191: tNumDistinctValues() { return numDistinctValues_; }
nit: seems like a move of the method in the module.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprCardinalityTest.java@211
PS10, Line 211: // Bug: NDV should be 1 to include nulls
This comment can be removed.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@176
PS10, Line 176: a
id


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@178
PS10, Line 178: f
some_nulls


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@180
PS10, Line 180: c
null_str


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/analysis/ExprNdvTest.java@182
PS10, Line 182: // NDV(b) = 1, add 1 for nulls
  : // Bug: See IMPALA-7310, IMPALA-8094
  : //verifyNdvStmt("SELECT blanks FROM functional.nullrows", 
2);
Seems like these lines can be removed.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
File fe/src/test/java/org/apache/impala/planner/CardinalityTest.java:

http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@132
PS10, Line 132:  f has NDV=3
This comment is not accurate.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@136
PS10, Line 136: c is all nulls
Seems like the reference to c is not right here.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@138
PS10, Line 138: a
same here.


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: a
same here


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@140
PS10, Line 140: f)
same


http://gerrit.cloudera.org:8080/#/c/16349/10/fe/src/test/java/org/apache/impala/planner/CardinalityTest.java@182
PS10, Line 182: = 1
Maybe as // NDV(id) = 26, ndv(null_str) = 1, NDV(id)*ndv(null_str) = 26.



--
To view, visit http://gerrit.cloudera.org:8080/16349
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec967053b4991f8c67cde62adf003cbd3f429032
Gerrit-Change-Number: 16349
Gerrit-PatchSet: 10
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 14:28:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9792: Implement splitting kudu scan ranges for greater parallelism

2020-08-31 Thread Grant Henke (Code Review)
Grant Henke has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16385 )

Change subject: IMPALA-9792: Implement splitting kudu scan ranges for greater 
parallelism
..


Patch Set 1:

(4 comments)

Awesome! I am super interested to see the performance impact of this change.

http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@15
PS1, Line 15: TARGETED_KUDU_SCAN_RANGE_LENGTH
nit: can you add the chosen default to the commit message here?


http://gerrit.cloudera.org:8080/#/c/16385/1//COMMIT_MSG@24
PS1, Line 24: Testing
Did you do any performance testing to gauge the impact (good or bad)?


http://gerrit.cloudera.org:8080/#/c/16385/1/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/16385/1/be/src/service/query-options.h@a50
PS1, Line 50:
Was this change an accident?


http://gerrit.cloudera.org:8080/#/c/16385/1/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/16385/1/common/thrift/ImpalaService.thrift@574
PS1, Line 574: mt_dop >= 2
nit: "mt_dop > 1" to simplify and to match the commit message.



--
To view, visit http://gerrit.cloudera.org:8080/16385
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia02fd94cc1d13c61bc6cb0765dd2cbe90e9a5ce8
Gerrit-Change-Number: 16385
Gerrit-PatchSet: 1
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Grant Henke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 31 Aug 2020 13:31:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16351 )

Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda
Gerrit-Change-Number: 16351
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 31 Aug 2020 13:28:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16351 )

Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
..

IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state

A query will come into the FINISHED state when some rows are available,
even when some fragment instances are still executing. When a retryable
query comes into the FINISHED state and the client hasn't fetched any
results, we are still able to retry it for any retryable failures. This
patch fixes a DCHECK when retrying a FINISHED state query.

Tests:
 - Add a test in test_query_retries.py for retrying a query in FINISHED
   state.

Change-Id: I11d82bf80640760a47325833463def8a3791bdda
Reviewed-on: http://gerrit.cloudera.org:8080/16351
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M be/src/runtime/query-driver.cc
M tests/custom_cluster/test_query_retries.py
2 files changed, 25 insertions(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda
Gerrit-Change-Number: 16351
Gerrit-PatchSet: 5
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 


[Impala-ASF-CR] IMPALA-10115: Impala should check file schema as well to check full ACIDv2 files

2020-08-31 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16383 )

Change subject: IMPALA-10115: Impala should check file schema as well to check 
full ACIDv2 files
..


Patch Set 1: Code-Review+2

(1 comment)

The patch seems good to me; my only concern is about losing test coverage in 
the future.

http://gerrit.cloudera.org:8080/#/c/16383/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16383/1//COMMIT_MSG@20
PS1, Line 20: * tested manually on a file that has ACIDv2 schema, but
:   'hive.acid.version' is missing
I would prefer to have an automatic test with a specific file, as Hive may set 
"hive.acid.version" during query-based compaction in the future, but should 
still be able to handle files written by older versions.



--
To view, visit http://gerrit.cloudera.org:8080/16383
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I52642c1755599efd28fa2c90f13396cfe0f5fa14
Gerrit-Change-Number: 16383
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 31 Aug 2020 12:01:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 25:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7052/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 11:30:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 25:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6366/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 25
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 11:15:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10108: Implement ds kll stringify function

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16370 )

Change subject: IMPALA-10108: Implement ds_kll_stringify function
..


Patch Set 7: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6362/


--
To view, visit http://gerrit.cloudera.org:8080/16370
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I97f654a4838bf91e3e0bed6a00d78b2c7aa96f75
Gerrit-Change-Number: 16370
Gerrit-PatchSet: 7
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 31 Aug 2020 11:14:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#25). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..

IMPALA-9741: Support querying Iceberg table by impala

This patch mainly implements querying of Iceberg tables through Impala.
We can use the following SQL to create an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or just include the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format in Iceberg. Currently only
PARQUET is supported; other formats will be supported in the future.
If you don't specify this property in your SQL, the default file format
is PARQUET.

We achieve this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then pass this information to the BE to
do the actual scan operation.
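
The pruning step described above can be pictured with a toy model. This is
a sketch only: DataFile, plan_files, the file paths and the partition values
are all hypothetical. It does not use the Iceberg API and does not show how
Impala hands the result to the BE.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    """Hypothetical stand-in for a data file entry tracked by table metadata."""
    path: str
    partition: dict


def plan_files(files, predicate):
    """Return the paths of files whose partition values satisfy the predicate."""
    return [f.path for f in files if predicate(f.partition)]


files = [
    DataFile("data/a.parquet", {"level": "INFO"}),
    DataFile("data/b.parquet", {"level": "ERROR"}),
    DataFile("data/c.parquet", {"level": "ERROR"}),
]

# Equivalent in spirit to pushing down "WHERE level = 'ERROR'":
# only the files that can contain matching rows are kept for scanning.
to_scan = plan_files(files, lambda p: p["level"] == "ERROR")
```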

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in test_scanners.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-1-5dbd44ad-18bc-40f2-9dd6-aeb2cc23457c-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-3-27db2521-1e8b-40c1-b846-552cd620abce-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-4-f1b55628-0544-4833-8b11-1b4add53dfd6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-6-f75530ef-93b6-4994-b3c8-db957d44848c-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-7-8d9b22da-5f10-4cbf-8e4d-160f829b5e48-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-9-f029a1f7-9024-4bc3-a030-e20861586146-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-11-f07814ae-56cd-486b-af81-18541437da7d-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-12-967c70a4-bf4d-4a82-8c97-c90e2b4d9dcf-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-14-d0cdca7f-c050-407e-b70c-2bd076f83e4e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-15-0e931a1f-309e-43b3-a5cf-3ef82fa4a87c-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-17-43138078-244c-4b38-8127-04a5bfbc4695-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-19-52569895-df25-4ad8-b64d-49c4540d36c9-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-20-f160c1ea-a2f5-4109-81ec-3ff9c155430f-0.parquet
A 

[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16389 )

Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH 
clauses
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6365/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Gerrit-Change-Number: 16389
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 31 Aug 2020 10:17:35 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16389 )

Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH 
clauses
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7051/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Gerrit-Change-Number: 16389
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 31 Aug 2020 10:13:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10051: impala-shell exits with ValueError with WITH clauses

2020-08-31 Thread Tamas Mate (Code Review)
Tamas Mate has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16389 )


Change subject: IMPALA-10051: impala-shell exits with ValueError with WITH 
clauses
..

IMPALA-10051: impala-shell exits with ValueError with WITH clauses

When a query contains a WITH clause, impala-shell tries to identify whether
it is a DML query or not, so that later it can provide appropriate
result messages. Earlier, shlex was used to create tokens and assess the
query type based on that. However, shlex can misinterpret some query
strings where whitespace characters are mixed with quotes, because it
splits the string based on whitespace characters. In some scenarios, a
'ValueError: No closing quotation' error can occur.

This change moves the tokenization from shlex to sqlparse.
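
The shlex failure mode mentioned above can be reproduced with the standard
library alone. The snippet below is a minimal sketch, not the impala-shell
code: the helper name try_shlex_tokens and the sample queries are made up
for illustration, and an apostrophe in a SQL comment is just one way to
trigger the error. It shows shlex's behaviour only, not the sqlparse-based
replacement.

```python
import shlex


def try_shlex_tokens(query):
    """Hypothetical helper: tokenize a query with shlex.

    Returns (tokens, None) on success or (None, error_message) if shlex
    raises ValueError.
    """
    try:
        return shlex.split(query), None
    except ValueError as e:
        return None, str(e)


# A query without any quote characters tokenizes without trouble.
tokens, err = try_shlex_tokens("with t as (select 1) select * from t")

# shlex applies shell quoting rules, so the apostrophe in the SQL comment
# is treated as the start of a quoted string that never ends.
bad_tokens, bad_err = try_shlex_tokens(
    "with t as (select 1) select * from t -- don't")
```

A SQL-aware tokenizer treats the comment as a comment, which is the
motivation the commit message gives for moving away from shlex.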

Testing:
 - Added unit test to cover queries that contain mixed whitespaces
   and strings

Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
---
M shell/impala_shell.py
M tests/shell/test_shell_interactive.py
2 files changed, 21 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/89/16389/2
--
To view, visit http://gerrit.cloudera.org:8080/16389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I442d3bc65b90a55c73c847948d5179a8586d71ad
Gerrit-Change-Number: 16389
Gerrit-PatchSet: 2
Gerrit-Owner: Tamas Mate 


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 24: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6364/


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 09:44:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 24:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7050/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 08:45:11 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 24: Code-Review+1

Thanks for the changes! I've just restarted the verify job on PS24.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 08:30:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 24:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6364/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 08:30:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#24). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..

IMPALA-9741: Support querying Iceberg table by impala

This patch mainly implements querying of Iceberg tables through Impala.
We can use the following SQL to create an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
    level string,
    event_time timestamp,
    message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or specify just the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by Iceberg. Currently only
PARQUET is supported; other formats will be supported in the future. If
you don't specify this property in your SQL, the default file format
is PARQUET.

We achieve this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then pass this information to the BE to
do the actual scan.
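
As a rough illustration of the pushdown idea described above (not the
patch's actual code), the sketch below filters a list of data files by
predicates on their partition values and returns only the paths that
would be handed to the backend. The names DataFile and plan_scan, the
file paths, and the use of 'level' as a partition column are all
hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataFile:
    """Hypothetical stand-in for one data file entry in table metadata."""
    path: str
    partition: Dict[str, str]  # partition column name -> partition value


def plan_scan(files: List[DataFile],
              predicates: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Return the paths of files whose partition values satisfy every predicate."""
    selected = []
    for f in files:
        # A file is kept only if each predicate's column is present in the
        # file's partition data and the predicate accepts its value.
        if all(col in f.partition and pred(f.partition[col])
               for col, pred in predicates.items()):
            selected.append(f.path)
    return selected


# Hypothetical metadata: three data files partitioned by 'level'.
files = [
    DataFile("hdfs://xxx/data/level=ERROR/a.parquet", {"level": "ERROR"}),
    DataFile("hdfs://xxx/data/level=INFO/b.parquet", {"level": "INFO"}),
    DataFile("hdfs://xxx/data/level=ERROR/c.parquet", {"level": "ERROR"}),
]

# Equivalent of: WHERE level = 'ERROR'
to_scan = plan_scan(files, {"level": lambda v: v == "ERROR"})
print(to_scan)
```

With no predicates, plan_scan returns every file, which corresponds to a
full scan of the table.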

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in test_scanners.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_partitioned/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_partitioned/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_partitioned/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
R testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-profile.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
R tests/query_test/test_iceberg.py
M tests/query_test/test_scanners.py
45 files changed, 1,436 insertions(+), 200 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/16143/24
--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 24
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 

[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16351 )

Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6363/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda
Gerrit-Change-Number: 16351
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 31 Aug 2020 08:17:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state

2020-08-31 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16351 )

Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
..


Patch Set 4:

> Patch Set 4: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6361/

Hit IMPALA-9351. Rerunning the test.


--
To view, visit http://gerrit.cloudera.org:8080/16351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda
Gerrit-Change-Number: 16351
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 31 Aug 2020 08:17:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 22:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/7049/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 22
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 06:44:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 23:

Hi Tim, Zoltan, thanks for your review. I've discussed this with my colleague 
who is researching Iceberg, and we found that we can generate Iceberg data 
without the hardcoded 'hdfs://localhost:20500' in the metadata, so we 
regenerated these test files and updated the patch. Now these JSON and Avro 
files do not contain 'hdfs://localhost:20500'. Would you please restart the 
job to verify the patch?


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 23
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Mon, 31 Aug 2020 06:28:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-08-31 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#22). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..

IMPALA-9741: Support querying Iceberg table by impala

This patch mainly implements querying of Iceberg tables through Impala.
We can use the following SQL to create an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
    level string,
    event_time timestamp,
    message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or specify just the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by Iceberg. Currently only
PARQUET is supported; other formats will be supported in the future. If
you don't specify this property in your SQL, the default file format
is PARQUET.

We achieve this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then pass this information to the BE to
do the actual scan.

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in test_scanners.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/AlterTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TruncateStmt.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_non_partitioned/metadata/version-hint.text
A testdata/data/iceberg_test/iceberg_partitioned/metadata/v1.metadata.json
A testdata/data/iceberg_test/iceberg_partitioned/metadata/v2.metadata.json
A testdata/data/iceberg_test/iceberg_partitioned/metadata/version-hint.text
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
R testdata/workloads/functional-query/queries/QueryTest/iceberg-create.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-negative.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-profile.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-query.test
M testdata/workloads/functional-query/queries/QueryTest/show-create-table.test
R tests/query_test/test_iceberg.py
M tests/query_test/test_scanners.py
45 files changed, 1,436 insertions(+), 200 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/43/16143/22
--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 22
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Anonymous Coward (606)
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 

[Impala-ASF-CR] IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state

2020-08-31 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16351 )

Change subject: IMPALA-10065: Fix DCHECK when retrying a query in FINISHED state
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6361/


--
To view, visit http://gerrit.cloudera.org:8080/16351
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11d82bf80640760a47325833463def8a3791bdda
Gerrit-Change-Number: 16351
Gerrit-PatchSet: 4
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sahil Takiar 
Gerrit-Comment-Date: Mon, 31 Aug 2020 06:06:12 +
Gerrit-HasComments: No