[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16206 )

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16206/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/16206/4/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@3150
PS4, Line 3150: loadAndSet
Maybe I misunderstand, but why do we call loadAndSet here? The original intention 
was to not set the file metadata in the table, but rather to get a copy of it to 
serve the client request based on a possibly different ValidWriteIdList. If you 
use loadAndSet, doesn't it change the state of the table in catalogd?
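
The concern can be illustrated with a small sketch. This is not Impala code: 
`Table`, `list_files`, `load_and_set`, and `load_copy` are hypothetical names, 
and the write-ID filter is only a stand-in for ValidWriteIdList handling. It 
shows the difference between a loader that overwrites the shared cached state 
and one that returns a per-request view.

```python
from dataclasses import dataclass, field


@dataclass
class Table:
    """Hypothetical stand-in for a table object cached in catalogd."""
    files: list = field(default_factory=list)


def list_files(max_write_id):
    # Pretend file listing: each file is tagged with the write ID that created it.
    all_files = [("f1", 1), ("f2", 2), ("f3", 3)]
    return [name for name, write_id in all_files if write_id <= max_write_id]


def load_and_set(table, max_write_id):
    # Overwrites the shared, cached state: every later reader sees this view.
    table.files = list_files(max_write_id)
    return table.files


def load_copy(table, max_write_id):
    # Leaves the cached table untouched (the argument is deliberately not
    # modified) and returns a view that only this request uses.
    return list_files(max_write_id)


cached = Table(files=list_files(3))

request_view = load_copy(cached, 2)
unchanged_after_copy = list(cached.files)

load_and_set(cached, 2)
changed_after_set = list(cached.files)
```

In this toy model, `load_copy` serves the older snapshot without touching 
`cached`, while `load_and_set` changes what every other reader of `cached` sees.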



--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Fri, 17 Jul 2020 05:46:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16123 )

Change subject: IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6142/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Gerrit-Change-Number: 16123
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 03:48:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16123 )

Change subject: IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6631/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Gerrit-Change-Number: 16123
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 03:37:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16123 )

Change subject: IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16123/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/16123/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3014
PS6, Line 3014: AnalyzesOk("select rank() over (order by int_col) from 
functional.alltypes intersect " +
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16123/6/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@3024
PS6, Line 3024:
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/16123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Gerrit-Change-Number: 16123
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 03:10:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

2020-07-16 Thread Shant Hovsepian (Code Review)
Hello David Rorke, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16123

to look at the new patch set (#6).

Change subject: IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.
..

IMPALA-9943, IMPALA-4974: INTERSECT and EXCEPT DISTINCT Support.

INTERSECT and EXCEPT set operations are implemented as rewrites to
joins. Currently only the DISTINCT qualified operators are implemented,
not ALL qualified. The operator MINUS is supported as an alias for
EXCEPT.

We mimic Hive's non-standard implementation which treats all operators
with the same precedence, as opposed to the SQL Standard of giving
INTERSECT higher precedence.
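
The practical effect of the precedence choice can be seen with a minimal 
sketch. It uses Python sets as a stand-in for DISTINCT set operations; the 
variable names and sample values are illustrative only and are not taken from 
the patch.

```python
# Three single-column "tables", modelled as sets of distinct values.
a = {1, 2, 3}
b = {3, 4}
c = {4, 5}

# Query shape: a UNION b INTERSECT c

# Equal precedence, evaluated left to right (the behavior described above).
left_to_right = (a | b) & c

# SQL-standard precedence: INTERSECT binds tighter than UNION.
standard = a | (b & c)

# EXCEPT (and its alias MINUS) corresponds to set difference.
except_result = a - b
```

With these inputs the two evaluation orders give different results, so queries 
that mix operators may need explicit parentheses to be portable.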

A new class, SetOperationStmt, was created to encompass the previous
UnionStmt behavior. UnionStmt is preserved as a special case for
union-only operands to ensure compatibility with previous union planning
behavior.

Tests:
* Added parser and analyzer tests.
* Ensured no test failures or plan changes for union tests.
* Added TPC-DS queries 14, 38, and 87 to functional and planner tests.
* Added functional tests test_intersect and test_except.

Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
A fe/src/main/java/org/apache/impala/analysis/SetOperationStmt.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/analysis/ValuesStmt.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/analysis/ParserTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-all.test
A testdata/workloads/functional-query/queries/QueryTest/except.test
A testdata/workloads/functional-query/queries/QueryTest/intersect.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q14-1.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q14-2.test
A testdata/workloads/tpcds/queries/tpcds-q14-1.test
A testdata/workloads/tpcds/queries/tpcds-q14-2.test
A testdata/workloads/tpcds/queries/tpcds-q38.test
A testdata/workloads/tpcds/queries/tpcds-q87.test
M testdata/workloads/tpch_nested/tpch_nested_core.csv
M tests/query_test/test_queries.py
M tests/query_test/test_tpcds_queries.py
M tests/util/parse_util.py
29 files changed, 3,725 insertions(+), 790 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/16123/6
--
To view, visit http://gerrit.cloudera.org:8080/16123

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5be46f824217218146ad48b30767af0fc7edbc0f
Gerrit-Change-Number: 16123
Gerrit-PatchSet: 6
Gerrit-Owner: Shant Hovsepian 
Gerrit-Reviewer: David Rorke 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Shant Hovsepian 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6630/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 11
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Fri, 17 Jul 2020 02:35:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by impala

2020-07-16 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#11). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by impala
..

IMPALA-9741: Support querying Iceberg table by impala

This patch adds basic support for querying Iceberg tables through Impala.
An external Iceberg table can be created with the following SQL:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or by specifying only the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by the Iceberg table.
Currently only PARQUET is supported; other formats will be supported in
the future. If this property is not specified in the SQL, the file
format defaults to PARQUET.

We implement this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then pass this information to the BE to
do the actual scan.
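
The pruning step described above can be pictured with a toy model. This is 
only a sketch under stated assumptions: `DataFile`, `prune_files`, the file 
paths, and the predicate representation are all hypothetical and do not 
reflect the Iceberg or Impala APIs.

```python
from dataclasses import dataclass


@dataclass
class DataFile:
    """Hypothetical description of one data file and its partition values."""
    path: str
    partition: dict  # partition column name -> value


def prune_files(files, predicates):
    """Keep only files whose partition values satisfy every predicate.

    `predicates` maps a partition column name to a function value -> bool.
    """
    return [
        f for f in files
        if all(
            col in f.partition and pred(f.partition[col])
            for col, pred in predicates.items()
        )
    ]


files = [
    DataFile("data/a.parquet", {"level": "ERROR"}),
    DataFile("data/b.parquet", {"level": "INFO"}),
    DataFile("data/c.parquet", {"level": "ERROR"}),
]

# Equivalent in spirit to: WHERE level = 'ERROR'
to_scan = prune_files(files, {"level": lambda v: v == "ERROR"})
scan_paths = [f.path for f in to_scan]
```

Only the surviving paths would then be handed to the scan, which is why the 
table itself can be treated as unpartitioned on the HDFS side.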

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in custom cluster test test_iceberg.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00023-122-de57b6b0-f54b-40ac-85cd-e783505094b6-0.parquet
A 

[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer 
instances
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6629/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16204

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 01:47:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer 
instances
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6628/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16204

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 01:40:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances

2020-07-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer 
instances
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16204/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/16204/2/common/thrift/ImpalaService.thrift@547
PS2, Line 547:   // Sets an upper limit on the number of fs writer instances to 
be scheduled during
> line too long (92 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16204/2/tests/custom_cluster/test_mt_dop.py
File tests/custom_cluster/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/16204/2/tests/custom_cluster/test_mt_dop.py@21
PS2, Line 21:
> flake8: F401 're' imported but unused
Done



--
To view, visit http://gerrit.cloudera.org:8080/16204

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 01:20:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances

2020-07-16 Thread Bikramjeet Vig (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16204

to look at the new patch set (#3).

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer 
instances
..

IMPALA-8125: Add query option to limit number of hdfs writer instances

This patch adds a new query option MAX_FS_WRITERS that limits the
number of HDFS writer instances.

Highlights:
- Depending on the plan, it either restricts the num of instances of
  the root fragment or adds an exchange and then limits the num of
  instances of that.
- Assigns instances evenly across available backends.
- The "no-shuffle" query hint is ignored when this query option is used.
- Plan behavior changes only when this query option is used.
- The only exception to the previous point is that the optimization
  logic that decides to add an exchange now looks at the num of
  instances instead of the number of nodes.
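
The "assigns instances evenly" point can be sketched as follows. This is an 
illustration only, not the scheduler's code: `assign_writer_instances`, its 
parameters, and the convention that 0 means "no limit" are assumptions made 
for the example.

```python
def assign_writer_instances(backends, num_candidate_instances, max_fs_writers):
    """Return {backend: instance_count}, capped at max_fs_writers in total.

    A max_fs_writers of 0 is treated as "no limit" (an assumption of this
    sketch, not a statement about the real query option).
    """
    total = num_candidate_instances
    if max_fs_writers > 0:
        total = min(total, max_fs_writers)
    base, extra = divmod(total, len(backends))
    # The first `extra` backends get one more instance, so the per-backend
    # counts never differ by more than one.
    return {b: base + (1 if i < extra else 0) for i, b in enumerate(backends)}


plan = assign_writer_instances(
    ["be1", "be2", "be3"], num_candidate_instances=12, max_fs_writers=7
)
```

Here 12 candidate instances are capped at 7 writers, which are then spread 
over three backends as 3, 2, and 2.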

Testing:
- Added planner tests to cover all cases where this enforcement kicks
  in and to highlight the behavior.
- Added e2e tests to confirm that the scheduler is enforcing the limit
  and distributing the instances evenly across backends for different
  plan shapes.

Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
---
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test
M tests/query_test/test_insert.py
16 files changed, 892 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/16204/3
--
To view, visit http://gerrit.cloudera.org:8080/16204

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 3
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16204 )

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer 
instances
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16204/2/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/16204/2/common/thrift/ImpalaService.thrift@547
PS2, Line 547:   // Sets an upper limit on the number of fs writer instances to 
be scheduled during insert.
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/16204/2/tests/custom_cluster/test_mt_dop.py
File tests/custom_cluster/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/16204/2/tests/custom_cluster/test_mt_dop.py@21
PS2, Line 21: import re
flake8: F401 're' imported but unused



--
To view, visit http://gerrit.cloudera.org:8080/16204

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 17 Jul 2020 01:13:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8125: Add query option to limit number of hdfs writer instances

2020-07-16 Thread Bikramjeet Vig (Code Review)
Hello Aman Sinha, Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16204

to look at the new patch set (#2).

Change subject: IMPALA-8125: Add query option to limit number of hdfs writer 
instances
..

IMPALA-8125: Add query option to limit number of hdfs writer instances

This patch adds a new query option MAX_FS_WRITERS that limits the
number of HDFS writer instances.

Highlights:
- Depending on the plan, it either restricts the num of instances of
  the root fragment or adds an exchange and then limits the num of
  instances of that.
- Assigns instances evenly across available backends.
- The "no-shuffle" query hint is ignored when this query option is used.
- Plan behavior changes only when this query option is used.
- The only exception to the previous point is that the optimization
  logic that decides to add an exchange now looks at the num of
  instances instead of the number of nodes.

Testing:
- Added planner tests to cover all cases where this enforcement kicks
  in and to highlight the behavior.
- Added e2e tests to confirm that the scheduler is enforcing the limit
  and distributing the instances evenly across backends for different
  plan shapes.

Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
---
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test
M tests/custom_cluster/test_mt_dop.py
M tests/query_test/test_insert.py
17 files changed, 893 insertions(+), 34 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/16204/2
--
To view, visit http://gerrit.cloudera.org:8080/16204

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I17c8e61b9a32d908eec82c83618ff9caa41078a5
Gerrit-Change-Number: 16204
Gerrit-PatchSet: 2
Gerrit-Owner: Bikramjeet Vig 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Fang-Yu Rao (Code Review)
Fang-Yu Rao has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 6: Code-Review+1

> Patch Set 6:
>
> (1 comment)

Thanks Adam!


--
To view, visit http://gerrit.cloudera.org:8080/16199

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 22:59:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16205 )

Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16205

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 22:31:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/16205 )

Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..

IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

Increases the timeout in
query_test.test_observability.TestQueryStates.test_error_query_state
from 30 seconds to 300 seconds.

Testing:
* Unable to reproduce the issue locally, but looped the test fix for an
  hour with 8 concurrent streams

Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Reviewed-on: http://gerrit.cloudera.org:8080/16205
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M tests/query_test/test_observability.py
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/16205

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 3
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xiaomeng Zhang 


[Impala-ASF-CR] IMPALA-7655: Implement codegen for conditional functions (if, isnull, coalesce)

2020-07-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16208 )

Change subject: IMPALA-7655: Implement codegen for conditional functions (if, 
isnull, coalesce)
..


Patch Set 1:

(3 comments)

I'll wait for the tests and perf benchmarks but the code looks good. Thanks for 
picking this up.

Hopefully we already have pretty good test coverage for these functions, but I 
haven't checked myself.

http://gerrit.cloudera.org:8080/#/c/16208/1/be/src/exprs/conditional-functions.cc
File be/src/exprs/conditional-functions.cc:

http://gerrit.cloudera.org:8080/#/c/16208/1/be/src/exprs/conditional-functions.cc@20
PS1, Line 20: #include "exprs/conditional-functions.h"
nit: convention is to put the corresponding header first in a separate block.


http://gerrit.cloudera.org:8080/#/c/16208/1/be/src/exprs/conditional-functions.cc@21
PS1, Line 21: #include "runtime/runtime-state.h"
Do we need these includes?


http://gerrit.cloudera.org:8080/#/c/16208/1/be/src/exprs/conditional-functions.cc@84
PS1, Line 84:   // TODO: Can we use non-const size arrays? Applies also to 
CaseExpr.
It'd be best to use a std::vector for unbounded variable-length arrays. This 
style of VLA is not officially part of C++, and there is a risk of stack 
overflows at runtime.



--
To view, visit http://gerrit.cloudera.org:8080/16208

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11f617a9148492ccafb46112ce0af103a10090f8
Gerrit-Change-Number: 16208
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Jul 2020 20:57:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7655: Implement codegen for conditional functions (if, isnull, coalesce)

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16208 )

Change subject: IMPALA-7655: Implement codegen for conditional functions (if, 
isnull, coalesce)
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6627/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16208

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I11f617a9148492ccafb46112ce0af103a10090f8
Gerrit-Change-Number: 16208
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 16 Jul 2020 20:29:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16206 )

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6626/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 20:05:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7655: Implement codegen for conditional functions (if, isnull, coalesce)

2020-07-16 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16208 )


Change subject: IMPALA-7655: Implement codegen for conditional functions (if, 
isnull, coalesce)
..

IMPALA-7655: Implement codegen for conditional functions (if, isnull, coalesce)

Implement proper codegen for conditional functions (if, isnull,
coalesce) instead of simply calling into interpreted code. We use
IRBuilder to generate hand-crafted code.

Change-Id: I11f617a9148492ccafb46112ce0af103a10090f8
---
M be/src/exprs/conditional-functions.cc
1 file changed, 149 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/08/16208/1
--
To view, visit http://gerrit.cloudera.org:8080/16208
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I11f617a9148492ccafb46112ce0af103a10090f8
Gerrit-Change-Number: 16208
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16206 )

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6625/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 20:00:50 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16206 )

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6624/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 19:49:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6623/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 19:48:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6621/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 19:48:07 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16206 )

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6622/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 19:47:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Zoltan Borok-Nagy (Code Review)
Hello Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16206

to look at the new patch set (#3).

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..

IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

IMPALA-9859 added separate fields for insert and delete file
descriptors. They are needed for full ACID tables.
I did not set these in CatalogServiceCatalog.setFileMetadataFromFS
which could result in a NullPointerException in CatalogdMetaProvider.

During the fix I found another bug related to delete delta files. In
AcidUtils we did not filter them based on the valid write id list. I
fixed this issue as well in this commit.

Added unit tests for both issues.
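
The delete delta filtering described above can be illustrated with a small
sketch. This is a simplified model, not the AcidUtils code: ToyWriteIdList,
filter_delete_deltas, and the rule "keep a directory only if every write id in
its range is valid" are assumptions made for illustration, and the directory
names only follow the delete_delta_<min>_<max> pattern.

```python
import re
from dataclasses import dataclass, field


# Hypothetical stand-in for a valid write id list; not Impala's or Hive's class.
@dataclass(frozen=True)
class ToyWriteIdList:
    high_watermark: int
    invalid_ids: frozenset = field(default_factory=frozenset)

    def is_valid(self, write_id: int) -> bool:
        # A write id is visible if it is at or below the high watermark
        # and is not listed as invalid (for example open or aborted).
        return write_id <= self.high_watermark and write_id not in self.invalid_ids


# Matches delete_delta_<min>_<max> with an optional trailing statement id.
_DELETE_DELTA = re.compile(r"^delete_delta_(\d+)_(\d+)(?:_\d+)?$")


def filter_delete_deltas(dir_names, write_ids):
    """Drop delete delta directories whose write id range is not fully valid."""
    kept = []
    for name in dir_names:
        match = _DELETE_DELTA.match(name)
        if match is None:
            # Not a delete delta; this sketch leaves such entries untouched.
            kept.append(name)
            continue
        low, high = int(match.group(1)), int(match.group(2))
        if all(write_ids.is_valid(w) for w in range(low, high + 1)):
            kept.append(name)
    return kept
```

With a high watermark of 4 and write id 3 marked invalid, delete deltas for
write ids 3 and 5 are dropped while the one for write id 2 is kept.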

Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
6 files changed, 122 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/16206/3
--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Zoltan Borok-Nagy (Code Review)
Hello Quanlong Huang, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16206

to look at the new patch set (#4).

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..

IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

IMPALA-9859 added separate fields for insert and delete file
descriptors. They are needed for full ACID tables.
I did not set these in CatalogServiceCatalog.setFileMetadataFromFS
which could result in a NullPointerException in CatalogdMetaProvider.

During the fix I found another bug related to delete delta files. In
AcidUtils we did not filter them based on the valid write id list. I
fixed this issue as well in this commit.

Added unit tests for both issues.

Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
6 files changed, 121 insertions(+), 30 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/16206/4
--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Zoltan Borok-Nagy (Code Review)
Hello Quanlong Huang,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16206

to look at the new patch set (#2).

Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..

IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

IMPALA-9859 added separate fields for insert and delete file
descriptors. They are needed for full ACID tables.
I did not set these in CatalogServiceCatalog.setFileMetadataFromFS
which could result in a NullPointerException in CatalogdMetaProvider.

During the fix I found another bug related to delete delta files. In
AcidUtils we did not filter them based on the valid write id list. I
fixed this issue as well in this commit.

Added unit tests for both issues.

Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
6 files changed, 122 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/16206/2
--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py@831
PS3, Line 831: e
flake8: E501 line too long (101 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py@839
PS3, Line 839: s
flake8: E501 line too long (101 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py@841
PS3, Line 841: ,
flake8: E501 line too long (104 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py@846
PS3, Line 846: T
flake8: E501 line too long (95 > 90 characters)
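
One common way to address E501 is to split a long string literal into adjacent
literals inside parentheses, which Python joins at compile time. A minimal
sketch; the template text and the function name are hypothetical and are not
taken from test_ranger.py:

```python
# Hypothetical statement template; only the wrapping technique matters here.
# Adjacent string literals inside parentheses are concatenated by the parser,
# so each source line can stay under the configured length limit.
GRANT_TEMPLATE = (
    "grant {0} on database {1} "
    "to user {2}"
)


def build_grant(privilege, database, user):
    """Fill the template with positional values."""
    return GRANT_TEMPLATE.format(privilege, database, user)
```

The resulting string is identical to the single-line version, so only the
source layout changes.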



--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 19:19:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16199/3/tests/authorization/test_ranger.py@841
PS3, Line 841: "grant {0} on 
database {1} to user {2} with "
> The variable of 'result' is not used afterwards. Maybe we could remove it.
Ah, I missed it.
Thank you for bringing this to my attention.

Done.



--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 19:19:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..

IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

In "show tables" the ANY privilege was used, whereas in "show functions"
the required privilege was VIEW_METADATA.
To resolve the inconsistency, "show functions" now uses ANY instead of
VIEW_METADATA, like "show tables".

After this patch, a user granted only the CREATE privilege is able to
execute "show functions", making it easier for the user to manage the
functions it creates.

Testing:
-Ran CORE tests.
-Added new tests to check the privilege.

Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M tests/authorization/test_ranger.py
4 files changed, 45 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16199/3
--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 3
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

2020-07-16 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16206 )


Change subject: IMPALA-9964: Fill file descriptors properly in 
setFileMetadataFromFS
..

IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS

IMPALA-9859 added separate fields for insert and delete file
descriptors. They are needed for full ACID tables.
I did not set these in CatalogServiceCatalog.setFileMetadataFromFS
which could result in a NullPointerException in CatalogdMetaProvider.

During the fix I found another bug related to delete delta files. In
AcidUtils we did not filter them based on the valid write id list. I
fixed this issue as well in this commit.

Added unit tests for both issues.

Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
---
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/ParallelFileMetadataLoader.java
M fe/src/main/java/org/apache/impala/util/AcidUtils.java
M fe/src/test/java/org/apache/impala/catalog/PartialCatalogInfoWriteIdTest.java
M fe/src/test/java/org/apache/impala/util/AcidUtilsTest.java
6 files changed, 122 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/16206/1
--
To view, visit http://gerrit.cloudera.org:8080/16206
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2927171cf426597c86766fb83d565c5e57025c73
Gerrit-Change-Number: 16206
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..

IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

In "show tables" the ANY privilege was used, whereas in "show functions"
the required privilege was VIEW_METADATA.
To resolve the inconsistency, "show functions" now uses ANY instead of
VIEW_METADATA, like "show tables".

After this patch, a user granted only the CREATE privilege is able to
execute "show functions", making it easier for the user to manage the
functions it creates.

Testing:
-Ran CORE tests.
-Added new tests to check the privilege.

Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M tests/authorization/test_ranger.py
4 files changed, 53 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16199/6
--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 6
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 16 Jul 2020 18:30:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16205 )

Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6620/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16205
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 17:18:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16205 )

Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6141/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/16205
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 17:15:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16205 )

Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16205
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 2
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 17:15:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16205 )

Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/16205
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Xiaomeng Zhang 
Gerrit-Comment-Date: Thu, 16 Jul 2020 17:04:46 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6619/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 17:03:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

2020-07-16 Thread Sahil Takiar (Code Review)
Sahil Takiar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/16205 )


Change subject: IMPALA-9833: Bump timeout in 
TestQueryStates.test_error_query_state
..

IMPALA-9833: Bump timeout in TestQueryStates.test_error_query_state

Increases the timeout in
query_test.test_observability.TestQueryStates.test_error_query_state
from 30 seconds to 300 seconds.

Testing:
* Unable to reproduce the issue locally, but looped the test fix for an
  hour with 8 concurrent streams
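
A larger timeout in a polling wait only affects the slow path, because the wait
returns as soon as the condition holds. The sketch below is a generic polling
helper, not the code in test_observability.py; wait_until and its parameters
are hypothetical names chosen for illustration.

```python
import time


def wait_until(predicate, timeout_s, interval_s=0.01):
    """Poll predicate until it returns True or timeout_s seconds elapse.

    Returns True if the predicate succeeded, False if the deadline passed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Under this model, raising the timeout from 30 to 300 seconds does not slow down
runs where the expected state is reached quickly; it only gives slow
environments more time before the wait is reported as failed.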

Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
---
M tests/query_test/test_observability.py
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/05/16205/1
--
To view, visit http://gerrit.cloudera.org:8080/16205
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id48264c1a518c4d9baec89da75170d84f5ad55c2
Gerrit-Change-Number: 16205
Gerrit-PatchSet: 1
Gerrit-Owner: Sahil Takiar 


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..

IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

In "show tables" the ANY privilege was used, whereas in "show functions"
the required privilege was VIEW_METADATA.
To resolve the inconsistency, "show functions" now uses ANY instead of
VIEW_METADATA, like "show tables".

After this patch, a user granted only the CREATE privilege is able to
execute "show functions", making it easier for the user to manage the
functions it creates.

Testing:
-Ran CORE tests.
-Added new tests to check the privilege.

Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M tests/authorization/test_ranger.py
4 files changed, 53 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16199/5
--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 5
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6618/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 16:30:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Adam Tamas (Code Review)
Adam Tamas has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 4:

> Hi Adam, thanks for working on this patch!
 >
 > The patch looks good to me since you have implemented what Fredy
 > had suggested at https://issues.apache.org/jira/browse/IMPALA-7001.
 > I only have two minor comments regarding the test and the commit
 > message.
 >
 > Specifically, after your patch, a user granted only the privilege
 > of CREATE on a specified database, e.g., functional, would be able
 > to execute a statement like "SHOW FUNCTIONS IN functional", since
 > according to 
 > https://github.com/apache/impala/blob/master/fe/src/main/java/org/apache/impala/authorization/Privilege.java
 > and 
 > https://github.com/apache/impala/blob/3a6022ce80ca1cedb629400b18caaf0d1f54137c/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java#L431-L453,
 > such a statement would succeed as long as the user is granted any
 > privilege in the set {ALL, OWNER, ALTER, DROP, CREATE, INSERT,
 > SELECT, REFRESH}.
 >
 > Before your patch, in order for the statement above to succeed, a
 > user has to be granted any privilege in the set {INSERT, SELECT,
 > REFRESH}. Thus I think it would be good to add one more test case
 > in 
 > https://github.com/apache/impala/blob/master/tests/authorization/test_ranger.py,
 > where we 1) grant the privilege of CREATE to a user (as
 > admin_client), and 2)  execute a statement like "SHOW FUNCTIONS IN
 > unique_database" to verify there is no exception thrown.
 >
 > On the other hand, I think it may also be good to provide more
 > detail of the difference before and after the patch. For instance,
 > we could mention that a user granted only the privilege of CREATE
 > is now able to execute that SQL statement above after this patch,
 > making it easier for the user to manage the functions it creates.

Hi and thank you for the review, I updated it based on the suggestions.


--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 16:29:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16191 )

Change subject: IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6617/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94a08e272f0870c04c96563fa614e3416fb5379b
Gerrit-Change-Number: 16191
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Thu, 16 Jul 2020 16:15:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/16199/4/tests/authorization/test_ranger.py
File tests/authorization/test_ranger.py:

http://gerrit.cloudera.org:8080/#/c/16199/4/tests/authorization/test_ranger.py@832
PS4, Line 832: e
flake8: E501 line too long (101 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/16199/4/tests/authorization/test_ranger.py@841
PS4, Line 841: s
flake8: E501 line too long (101 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/16199/4/tests/authorization/test_ranger.py@844
PS4, Line 844: ,
flake8: E501 line too long (104 > 90 characters)


http://gerrit.cloudera.org:8080/#/c/16199/4/tests/authorization/test_ranger.py@851
PS4, Line 851: T
flake8: E501 line too long (95 > 90 characters)



--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 16 Jul 2020 16:12:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

2020-07-16 Thread Adam Tamas (Code Review)
Adam Tamas has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/16199 )

Change subject: IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES 
and SHOW FUNCTIONS
..

IMPALA-7001: Fix Privilege inconsistency between SHOW TABLES and SHOW FUNCTIONS

In "show tables" the ANY privilege was used, whereas in "show functions"
the required privilege was VIEW_METADATA.
To resolve the inconsistency, "show functions" now uses ANY instead of
VIEW_METADATA, the same as "show tables".

After this patch, a user granted only the CREATE privilege is able to
execute "show functions", making it easier for the user to manage the
functions it creates.
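
The effect of the change can be illustrated with a small sketch. This is
only a toy model in Python, not Impala's Privilege.java: the two privilege
sets are copied from the review comment on this change, and the function
names is_authorized and can_show_functions are hypothetical.

```python
# Toy model of the privilege check; not Impala code.
# Sets as listed in the review comment on this change.
ANY = frozenset({"ALL", "OWNER", "ALTER", "DROP", "CREATE",
                 "INSERT", "SELECT", "REFRESH"})
VIEW_METADATA = frozenset({"INSERT", "SELECT", "REFRESH"})


def is_authorized(granted, required):
    """Return True if at least one granted privilege is in the required set."""
    return bool(set(granted) & required)


def can_show_functions(granted, patched):
    """Hypothetical helper: which set "show functions" checks before/after the patch."""
    required = ANY if patched else VIEW_METADATA
    return is_authorized(granted, required)


if __name__ == "__main__":
    # A user holding only CREATE is rejected before the patch, accepted after.
    print(can_show_functions({"CREATE"}, patched=False))
    print(can_show_functions({"CREATE"}, patched=True))
```

Under this model a user holding only CREATE fails the check before the
patch and passes it afterwards, which is the behavior the new test is
meant to cover.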

Testing:
-Ran CORE tests.
-Added new tests to check the privilege.

Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/ShowFunctionsStmt.java
M fe/src/test/java/org/apache/impala/analysis/AuditingTest.java
M tests/authorization/test_ranger.py
4 files changed, 50 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/99/16199/4
--
To view, visit http://gerrit.cloudera.org:8080/16199
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ae7546c206daaf98ecc3de449069027c43c6e1a
Gerrit-Change-Number: 16199
Gerrit-PatchSet: 4
Gerrit-Owner: Adam Tamas 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table

2020-07-16 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/16191 )

Change subject: IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table
..

IMPALA-9889: Fixed flaky test_runtime_filters on Kudu table

Test cases in test_runtime_filters failed occasionally in ASAN
builds because runtime filters did not arrive at scan nodes in time.
Query profiles showed that codegen took 2 to 4 minutes for one
fragment when this issue happened. This caused hash join nodes to
wait a long time before generating and publishing runtime filters,
which delayed their arrival at scan nodes. To avoid the delay, turn
on ASYNC_CODEGEN for test_runtime_filters against Kudu tables when
the test runs on an ASAN build.
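
A minimal sketch of the selection logic described above, in Python. The
helper name exec_options_for and its parameters are hypothetical and are
not taken from test_runtime_filters.py; only the ASYNC_CODEGEN query
option and the Kudu/ASAN condition come from this commit message.

```python
def exec_options_for(table_format, is_asan_build):
    """Hypothetical helper: extra query options for a runtime filter test run.

    Enables ASYNC_CODEGEN only for Kudu tables on ASAN builds, so that slow
    codegen does not hold up the publishing of runtime filters.
    """
    options = {}
    if table_format == "kudu" and is_asan_build:
        options["async_codegen"] = 1
    return options


if __name__ == "__main__":
    print(exec_options_for("kudu", is_asan_build=True))
    print(exec_options_for("kudu", is_asan_build=False))
```

Keeping the option off for other builds leaves the regular debug runs on
their existing code path.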

Testing:
 - Passed core test for regular debug build and ASAN build.

Change-Id: I94a08e272f0870c04c96563fa614e3416fb5379b
---
M tests/query_test/test_runtime_filters.py
1 file changed, 19 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/91/16191/4
--
To view, visit http://gerrit.cloudera.org:8080/16191
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I94a08e272f0870c04c96563fa614e3416fb5379b
Gerrit-Change-Number: 16191
Gerrit-PatchSet: 4
Gerrit-Owner: Wenzhe Zhou 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6616/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 16 Jul 2020 13:51:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6140/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 16 Jul 2020 13:24:23 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level

2020-07-16 Thread Quanlong Huang (Code Review)
Hello Anurag Mantripragada, Vihang Karajgaonkar, Tim Armstrong, Impala Public 
Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16159

to look at the new patch set (#3).

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
..

IMPALA-3127: Support incremental metadata updates in partition level

Currently, partitions are tightly integrated into the HdfsTable objects.
Catalogd has to transmit the entire table metadata even when few
partitions change. This is a waste of resources and can lead to OOM in
transmitting large tables due to the 2GB JVM array limit.

This patch makes HdfsPartition extend CatalogObject so the catalogd can
send partitions as individual catalog objects. Consequently, table
objects in the catalog topic update can have minimal partition maps that
only contain the partition ids, which reduces the thrift object size for
large tables. The catalog object key of HdfsPartition consists of db
name, table name and partition name.

In "full" topic mode (catalog_topic_mode=full), catalogd only sends
changed partitions with their latest table states. The latest table
states are table objects with the minimal partition map. Legacy
coordinators use the partition list to pick up existing (unchanged)
partitions from the existing table object and new partitions in the
catalog update.

Currently, partition instances are immutable - all partition
modifications are implemented by deleting the old instance and adding a
new one with a new partition id. Since partition ids are generated by a
global counter, newer partition instances have larger partition ids.
Catalogd therefore maintains a watermark for each table: the max
partition id sent so far. Partition instances with ids larger than the
watermark are new partitions that should be sent in the next catalog
update. Deleted partition instances are kept in a per-table set until
the next catalog update. If there are no updates on the same partition
name, catalogd will send a deletion for the partition.
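
The watermark bookkeeping can be sketched roughly as follows. This is a
toy Python model under simplifying assumptions, not the Java code in this
patch: the class TablePartitionTracker and its methods are hypothetical,
and partitions are reduced to a name and an id.

```python
import itertools


class TablePartitionTracker:
    """Toy model of the per-table state described above (names are hypothetical)."""

    # Global id counter shared by all tables, so newer instances get larger ids.
    _next_id = itertools.count()

    def __init__(self):
        self.partitions = {}        # partition name -> id of the current instance
        self.max_sent_id = -1       # watermark: max partition id sent so far
        self.dropped_names = set()  # names of instances deleted since the last update

    def add(self, name):
        self.partitions[name] = next(self._next_id)

    def drop(self, name):
        self.partitions.pop(name)
        self.dropped_names.add(name)

    def update(self, name):
        # Instances are immutable: an update is a delete plus an add with a new id.
        self.drop(name)
        self.add(name)

    def collect_update(self):
        """Return (names to send, names to delete) for the next catalog update."""
        new = sorted(n for n, pid in self.partitions.items()
                     if pid > self.max_sent_id)
        # A dropped name that exists again was updated, so no deletion is sent.
        deleted = sorted(n for n in self.dropped_names
                         if n not in self.partitions)
        if self.partitions:
            self.max_sent_id = max(self.max_sent_id,
                                   max(self.partitions.values()))
        self.dropped_names.clear()
        return new, deleted
```

For example, after both partitions of a table have been sent once,
calling update("year=2010") and drop("year=2009") makes the next
collect_update() return year=2010 as the only partition to send and
year=2009 as the only deletion.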

For dropped or invalidated tables, catalogd will still send deletions on
their partitions. Although they are not used in coordinators
(coordinators delete the partitions when they delete the table
instances), they help in avoiding topic entry leak in the statestore
catalog topic.

In "minimal" topic mode (catalog_topic_mode=minimal), catalogd only
sends invalidations on tables and stale partition instances. Each
partition instance is identified by its partition id. LocalCatalog
coordinators use the partition invalidations to evict stale partitions
in time. For instance, let's say partition(year=2010) is updated in
catalogd. This is done by deleting the old partition instance
partition(id=0, year=2010) and adding a new partition instance
partition(id=1, year=2010). Catalogd will send invalidations on the
table and partition instance with id=0, but not the one with id=1. A
LocalCatalog coordinator will invalidate the partition instance(id=0) if
it's in the cache. If the partition instance(id=1) is cached, it's
already the latest version since partition instances are immutable. So
we don't need to invalidate it.
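
The year=2010 example can be sketched as a small Python model. It
assumes a cache keyed by partition id; the class LocalPartitionCache and
its methods are hypothetical stand-ins, not the CatalogdMetaProvider
code.

```python
class LocalPartitionCache:
    """Toy coordinator-side cache keyed by partition id (hypothetical names)."""

    def __init__(self):
        self._by_id = {}  # partition id -> cached partition metadata

    def put(self, part_id, meta):
        self._by_id[part_id] = meta

    def __contains__(self, part_id):
        return part_id in self._by_id

    def apply_invalidations(self, stale_ids):
        """Evict the given stale instances; return the ids actually evicted."""
        evicted = []
        for part_id in stale_ids:
            if part_id in self._by_id:
                del self._by_id[part_id]
                evicted.append(part_id)
        return evicted


if __name__ == "__main__":
    cache = LocalPartitionCache()
    cache.put(0, "partition(id=0, year=2010)")
    cache.put(1, "partition(id=1, year=2010)")
    # Catalogd invalidates only the old instance; id=1 stays cached.
    print(cache.apply_invalidations([0]))
    print(1 in cache)
```

Because instances are immutable, an id that is still cached never needs
to be refreshed; it can only be evicted.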

Tests
 - Run exhaustive tests.
 - Run exhaustive test_ddl.py in LocalCatalog mode.
 - (TODO) Add tests with a long statestore update interval so that
   several table changes are sent in the same topic update.
 - (TODO) Add tests on straggler coordinators that need to process
   several incremental updates at once.
 - (TODO) Add tests verifying that no statestore topic entries leak.

Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
---
M be/src/catalog/catalog-util.cc
M common/thrift/CatalogObjects.thrift
M fe/src/main/java/org/apache/impala/catalog/Catalog.java
M fe/src/main/java/org/apache/impala/catalog/CatalogObject.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/ImpaladCatalog.java
M fe/src/main/java/org/apache/impala/catalog/Table.java
M fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
12 files changed, 502 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/16159/3
--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 

[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level

2020-07-16 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
..


Patch Set 2:

(2 comments)

> Patch Set 2: Verified-1
>
> Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6139/

The failure is related to IMPALA-9833

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1574
PS2, Line 1574: for (Map.Entry part : 
hdfsTable.getPartitions().entrySet()) {
> line too long (91 > 90)
Done


http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/16159/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4241
PS2, Line 4241: // TODO(IMPALA-9937): if client is a 'v1' impalad, only 
send back incremental updates
> line too long (93 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 16 Jul 2020 13:22:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3127: Support incremental metadata updates in partition level

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16159 )

Change subject: IMPALA-3127: Support incremental metadata updates in partition 
level
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6139/


--
To view, visit http://gerrit.cloudera.org:8080/16159
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia0abfb346903d6e7cdc603af91c2b8937d24d870
Gerrit-Change-Number: 16159
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Anurag Mantripragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Thu, 16 Jul 2020 08:39:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..


Patch Set 10:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6615/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 10
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:58:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..


Patch Set 8:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6613/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:54:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/6614/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:54:52 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#10). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..

IMPALA-9741: Support querying Iceberg table by Impala

This patch implements querying of Iceberg tables through Impala.
The following SQL creates an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Alternatively, specify only the table name and location:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by Iceberg. Currently
only PARQUET is supported; other formats will be supported in the
future. If this property is not specified in the SQL, the default
file format is PARQUET.

We implemented this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push
down partition column predicates to Iceberg to decide which data
files need to be scanned, and then pass this information to the BE
to do the actual scan.
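
The planning flow can be sketched in Python as below. This is a rough
model, not the IcebergScanNode code: the predicate filter stands in for
Iceberg's own file pruning, and the function names, the tuple layout of
data_files and the example paths are all hypothetical. The lookup of
file descriptors by the MD5 of the file path mirrors the
pathMD5ToFileDescMap_ usage quoted in the review comments on this
change.

```python
import hashlib


def path_md5(path):
    """Key a data file by the MD5 hex digest of its path."""
    return hashlib.md5(path.encode("utf-8")).hexdigest()


def plan_scan(data_files, predicate, file_descs_by_md5):
    """Pick the data files to scan and return their file descriptors.

    data_files: list of (path, partition_values) pairs; a stand-in for the
        data files Iceberg knows about.
    predicate: callable on partition_values; a stand-in for the partition
        column predicates pushed down to Iceberg.
    file_descs_by_md5: file descriptors loaded up front, keyed by path MD5.
    """
    selected = [path for path, part in data_files if predicate(part)]
    return [file_descs_by_md5[path_md5(path)] for path in selected]


if __name__ == "__main__":
    files = [
        ("hdfs://nn/t/data/a.parquet", {"level": "INFO"}),
        ("hdfs://nn/t/data/b.parquet", {"level": "ERROR"}),
    ]
    descs = {path_md5(p): {"path": p} for p, _ in files}
    # Only the file whose partition value matches the predicate is scanned.
    print(plan_scan(files, lambda part: part["level"] == "ERROR", descs))
```

Only the descriptors of the selected files would then be handed to the
BE for scanning.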

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in custom cluster test test_iceberg.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet
A 
testdata/data/iceberg_test/iceberg_non_partitioned/data/00023-122-de57b6b0-f54b-40ac-85cd-e783505094b6-0.parquet
A 

[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..


Patch Set 9:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16143/9/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/16143/9/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@208
PS9, Line 208: pathMD5ToFileDescMap_ = 
Utils.loadAllPartition(msTable_.getSd().getLocation(), this);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/16143/9/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@277
PS9, Line 277: pathMD5ToFileDescMap_ = 
loadFileDescFromThrift(ticeberg.getPath_md5_to_file_descriptors());
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/16143/9/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16143/9/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@102
PS9, Line 102:   
icebergTable_.getPathMD5ToFileDescMap().get(IcebergUtil.getDataFileMD5(dataFile));
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 9
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:29:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..


Patch Set 8:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16143/8/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/IcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/16143/8/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@208
PS8, Line 208: pathMD5ToFileDescMap_ = 
Utils.loadAllPartition(msTable_.getSd().getLocation(), this);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/16143/8/fe/src/main/java/org/apache/impala/catalog/IcebergTable.java@277
PS8, Line 277: pathMD5ToFileDescMap_ = 
loadFileDescFromThrift(ticeberg.getPath_md5_to_file_descriptors());
line too long (95 > 90)


http://gerrit.cloudera.org:8080/#/c/16143/8/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/16143/8/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@102
PS8, Line 102:   
icebergTable_.getPathMD5ToFileDescMap().get(IcebergUtil.getDataFileMD5(dataFile));
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:28:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9741: Support querying Iceberg table by Impala

2020-07-16 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support querying Iceberg table by Impala
..

IMPALA-9741: Support querying Iceberg table by Impala

This patch implements querying of Iceberg tables through Impala.
The following SQL creates an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Alternatively, specify only the table name and location:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by Iceberg. Currently
only PARQUET is supported; other formats will be supported in the
future. If this property is not specified in the SQL, the default
file format is PARQUET.

We implemented this by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push
down partition column predicates to Iceberg to decide which data
files need to be scanned, and then pass this information to the BE
to do the actual scan.

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in custom cluster test test_iceberg.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00023-122-de57b6b0-f54b-40ac-85cd-e783505094b6-0.parquet
A 

[Impala-ASF-CR] IMPALA-9741: Support quering Icebreg table by impala

2020-07-16 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Support quering Icebreg table by impala
..


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG@7
PS6, Line 7: Icebreg
> It's still misspelled
Done


http://gerrit.cloudera.org:8080/#/c/16143/6//COMMIT_MSG@26
PS6, Line 26: identity
> specify
Done



--
To view, visit http://gerrit.cloudera.org:8080/16143
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
Gerrit-Change-Number: 16143
Gerrit-PatchSet: 8
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Thu, 16 Jul 2020 06:28:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala

2020-07-16 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Supported query Icebreg table by impala
..

IMPALA-9741: Supported query Icebreg table by impala

This patch adds basic support for querying Iceberg tables through
Impala. We can use the following SQL to create an external Iceberg table:
CREATE EXTERNAL TABLE default.iceberg_test (
level string,
event_time timestamp,
message string
)
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
Or specify only the table name and location, like this:
CREATE EXTERNAL TABLE default.iceberg_test
STORED AS ICEBERG
LOCATION 'hdfs://xxx'
TBLPROPERTIES ('iceberg_file_format'='parquet');
'iceberg_file_format' is the file format used by Iceberg. Currently only
PARQUET is supported; other formats will be supported in the future. If
this property is not specified in the SQL, the file format defaults to
PARQUET.

This is implemented by treating the Iceberg table as a normal
unpartitioned HDFS table. When querying an Iceberg table, we push down
partition column predicates to Iceberg to decide which data files
need to be scanned, and then pass this information to the BE, which
performs the actual scan.

Testing:
- Unit test for Iceberg in FileMetadataLoaderTest
- Create table tests in functional_schema_template.sql
- Iceberg table query test in custom cluster test test_iceberg.py

Change-Id: I856cfee4f3397d1a89cf17650e8d4fbfe1f2b006
---
M be/src/runtime/descriptors.cc
M bin/rat_exclude_files.txt
M common/thrift/CatalogObjects.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
M fe/src/main/java/org/apache/impala/analysis/IcebergPartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/ShowFilesStmt.java
M fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
M fe/src/main/java/org/apache/impala/catalog/IcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsTable.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
A fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/SingleNodePlanner.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M fe/src/main/java/org/apache/impala/util/IcebergUtil.java
M fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
M testdata/data/README
A testdata/data/iceberg_test/iceberg_non_partitioned/data/1-100-e1a80ed6-1064-494d-9cdd-c4a30c1ab8dc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/3-102-511427f2-85f0-43ae-9b39-a456f8dc57b6-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/4-103-00fc55e1-6ef7-4241-ace2-6d075b9737fc-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/6-105-ef9e76d5-c060-4040-8aa1-b7c275610daa-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/7-106-c09c9c8d-9478-44f9-8501-f85f53112bc3-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/9-108-3b4f06ac-dca3-4f4e-be60-bf42d9927b5b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00011-110-1e653ccf-0963-4fb0-941c-32c9de13268b-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00012-111-dfa70658-eb4b-4fa0-9ffa-b892cf90d6ac-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00014-113-2d16e751-e2a4-4856-ab89-145996e3815e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00015-114-0f710621-cbbf-4509-a93d-b58808978e2e-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00017-116-0b666c79-53df-4507-906c-542e65a83443-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00019-118-1bc6bc6e-e061-4da3-9d1e-a427a306c471-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00020-119-ae7b2c67-1538-4429-8246-4998960e3817-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00022-121-8db0f1e1-d88c-4aad-a8b3-24fd07329cdb-0.parquet
A testdata/data/iceberg_test/iceberg_non_partitioned/data/00023-122-de57b6b0-f54b-40ac-85cd-e783505094b6-0.parquet
A 

[Impala-ASF-CR] IMPALA-9741: Supported query Icebreg table by impala

2020-07-16 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16143 )

Change subject: IMPALA-9741: Supported query Icebreg table by impala
..


Patch Set 8:

(13 comments)

Done!

http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@512
PS6, Line 512: column_to_sourc
> nit: column_to_source_id ?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@515
PS6, Line 515: source_id_to_partition
> The mapping is reversed. Name it "source_id_to_partition" ?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/common/thrift/CatalogObjects.thrift@516
PS6, Line 516: map path_md5_to_file
> Please follow the above conventions for naming maps.
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java
File fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/IcebergPartitionField.java@28
PS6, Line 28:   // The id of the source column in the Iceberg table schema. The source column is
:   // used as the input for this partition field.
> Might worth rewording it a bit:
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java
File fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/analysis/ShowStatsStmt.java@88
PS6, Line 88: if (table_ instanceof FeIcebergTable) {
:   if (((FeIcebergTable) table_).getSourceIdToPartitionMap().isEmpty()) {
:     notPartitioned = true;
:   }
> Probably we should treat all Iceberg tables as not partitioned, since it's
Yes, you are right: we treat the Iceberg table as an unpartitioned HDFS table.
However, an Iceberg table still has its own partition info, which we can get
with 'show partitions xxx.iceberg_table_test', like this:

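+--------------+-----------+----------+------------+---------------------------+
| Partition Id | Source Id | Field Id | Field Name | Field Partition Transform |
+--------------+-----------+----------+------------+---------------------------+
| 0            | 2         | 1000     | sex        | IDENTITY                  |
| 0            | 3         | 1001     | action     | IDENTITY                  |
+--------------+-----------+----------+------------+---------------------------+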

If I set 'notPartitioned' to true even when getPartitionColToSourceIdMap() is
not empty, how can I get the Iceberg partition info? 'show partitions
xxx.iceberg_table_test' would always throw an AnalysisException.


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
File fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@66
PS6, Line 66: getPathMD5ToFi
> nit: getPartitionToFileDescMap
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@219
PS6, Line 219: isPartitioned(Fe
> nit: isPartitioned?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@258
PS6, Line 258: PartitionColToSourceId
> It returns a mapping from source ids to partition columns, therefore please
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@271
PS6, Line 271: getColumnToSourc
> nit: getColumnToSourceIdMap?
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java@305
PS6, Line 305:   
> nit: wrong indentation
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
File fe/src/main/java/org/apache/impala/util/IcebergUtil.java:

http://gerrit.cloudera.org:8080/#/c/16143/6/fe/src/main/java/org/apache/impala/util/IcebergUtil.java@114
PS6, Line 114: if ("PARQUET".equalsIgnoreCase(format)) return TIcebergFileFormat.PARQUET;
 : return null;
 :   }
 :
 :   /**
 :* Build TIceb
> How about:
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py
File testdata/bin/generate-schema-statements.py:

http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py@193
PS6, Line 193:   }
> You probably don't need to modify this file. I think adding HUDIPARQUET to
Done


http://gerrit.cloudera.org:8080/#/c/16143/6/testdata/bin/generate-schema-statements.py@766
PS6, Line 766:
> flake8: E501 line too