[Impala-ASF-CR] IMPALA-10510: Change code to help with third party extensions

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17079 )

Change subject: IMPALA-10510: Change code to help with third party extensions
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4911ddef232301c99ceada4635b72bf4e57ea9c7
Gerrit-Change-Number: 17079
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 23 Feb 2021 04:37:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10510: Change code to help with third party extensions

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17079 )

Change subject: IMPALA-10510: Change code to help with third party extensions
..

IMPALA-10510: Change code to help with third party extensions

Made ArithmeticExpr.Operator public and made various changes to
HdfsTable which will allow third party extensions greater visibility.

Change-Id: I4911ddef232301c99ceada4635b72bf4e57ea9c7
Reviewed-on: http://gerrit.cloudera.org:8080/17079
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
2 files changed, 21 insertions(+), 12 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/17079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I4911ddef232301c99ceada4635b72bf4e57ea9c7
Gerrit-Change-Number: 17079
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Vihang Karajgaonkar 


[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17104 )

Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of 
externally compiled statements
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8202/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
Gerrit-Change-Number: 17104
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 23 Feb 2021 04:30:13 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10504: Add tracing for remote block reads

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17062 )

Change subject: IMPALA-10504: Add tracing for remote block reads
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8201/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c6a3e92f44813048022edf2b91299b3b0a20257
Gerrit-Change-Number: 17062
Gerrit-PatchSet: 6
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 23 Feb 2021 04:25:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10535: Add interface to ImpalaServer for execution of externally compiled statements

2021-02-22 Thread Kurt Deschler (Code Review)
Hello j...@cloudera.com, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17104

to look at the new patch set (#3).

Change subject: IMPALA-10535: Add interface to ImpalaServer for execution of 
externally compiled statements
..

IMPALA-10535: Add interface to ImpalaServer for execution of externally 
compiled statements

The ExecutePlannedStatement interface allows an externally supplied
TExecRequest to be executed by impalad. The TExecRequest must be fully
populated and will be sent directly to the backend for execution.

In order to add the interface to ImpalaInternalService.thrift, several of
the thrift classes were moved to Query.thrift to avoid a circular
dependency with Frontend.thrift.

Added functionality to format and dump TExecRequest structures to path
specified in debug flag dump_exec_request_path.

A start timestamp field has been added to TExecRequest to represent the
interval in the query profile between when the request was sent by the
external frontend and handled by the backend.

A local timestamp field has been added to the Ping result struct to
return the current backend timestamp. This is used by the external to
frontend to populate the start timestamp.

Also included is a change to avoid generating silent AnalysisExceptions
during table resolution.

Tested with TExecRequest structures populated by external frontend.

Reviewed-by: John Sherman 
Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
---
M be/generated-sources/gen-cpp/CMakeLists.txt
M be/src/rpc/hs2-http-test.cc
M be/src/runtime/debug-options.h
M be/src/runtime/query-driver.cc
M be/src/runtime/query-driver.h
M be/src/service/client-request-state.cc
M be/src/service/client-request-state.h
M be/src/service/impala-beeswax-server.cc
M be/src/service/impala-hs2-server.cc
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/thrift/CMakeLists.txt
M common/thrift/Frontend.thrift
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A common/thrift/Query.thrift
M fe/pom.xml
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/analysis/ResetMetadataStmt.java
M fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
21 files changed, 943 insertions(+), 755 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/04/17104/3
--
To view, visit http://gerrit.cloudera.org:8080/17104
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iace716dd67290f08441857dc02d2428b0e335eaa
Gerrit-Change-Number: 17104
Gerrit-PatchSet: 3
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10504: Add tracing for remote block reads

2021-02-22 Thread Kurt Deschler (Code Review)
Hello Aman Sinha, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17062

to look at the new patch set (#6).

Change subject: IMPALA-10504: Add tracing for remote block reads
..

IMPALA-10504: Add tracing for remote block reads

This patch logs metadata for the first unexpected remote read of each
scanrange when the flag fs_trace_remote_reads is set to true. This
logging is intended to help diagnose the root cause of remote reads.

Since a message may be logged for each scan range, there could be
several hundred lines of output in a degenerate case. However, the
remote read condition is not expected and verbose output may be needed
to diagnose the root cause.

Reviewed-by: Aman Sinha 
Change-Id: I8c6a3e92f44813048022edf2b91299b3b0a20257
---
M be/src/runtime/io/hdfs-file-reader.cc
M be/src/runtime/io/hdfs-file-reader.h
2 files changed, 44 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/62/17062/6
--
To view, visit http://gerrit.cloudera.org:8080/17062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8c6a3e92f44813048022edf2b91299b3b0a20257
Gerrit-Change-Number: 17062
Gerrit-PatchSet: 6
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 


[Impala-ASF-CR] IMPALA-9767: Do not clean up filter while PublishFilter is ongoing

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17095 )

Change subject: IMPALA-9767: Do not clean up filter while PublishFilter is 
ongoing
..


Patch Set 3: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08
Gerrit-Change-Number: 17095
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Tue, 23 Feb 2021 04:00:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10524: Changes to HdfsPartition for third party extensions.

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17092 )

Change subject: IMPALA-10524: Changes to HdfsPartition for third party 
extensions.
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8200/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a792642f27228118ac8f2e8ef98e8ba7aee4a46
Gerrit-Change-Number: 17092
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 23 Feb 2021 03:01:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8199/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 23 Feb 2021 02:59:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6913/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 23 Feb 2021 02:53:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 23 Feb 2021 02:53:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 23 Feb 2021 02:52:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10524: Changes to HdfsPartition for third party extensions.

2021-02-22 Thread Steve Carlin (Code Review)
Steve Carlin has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/17092 )

Change subject: IMPALA-10524: Changes to HdfsPartition for third party 
extensions.
..

IMPALA-10524: Changes to HdfsPartition for third party extensions.

Some changes are needed to HdfsPartition and other related classes
to allow for third party extensions.  These changes include:

- A protected constructor which will allow a subclass to instantiate
  HdfsPartition using its own Builder.
- Various changes of permissions to methods and variables to allow
  third party extension visibility.
- Creation of the getHostIndex() method to allow the subclass to
  override how the hostIndexes are retrieved.
- Added a new default method "getFileSystem()" to FeFsPartition which
  will allow the third party extension to override how the filesystem
  is obtained from the partition object.

Change-Id: I5a792642f27228118ac8f2e8ef98e8ba7aee4a46
---
M fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
M fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
M 
fe/src/main/java/org/apache/impala/catalog/HdfsPartitionLocationCompressor.java
M fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
6 files changed, 51 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/92/17092/5
--
To view, visit http://gerrit.cloudera.org:8080/17092
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5a792642f27228118ac8f2e8ef98e8ba7aee4a46
Gerrit-Change-Number: 17092
Gerrit-PatchSet: 5
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Steve Carlin (Code Review)
Steve Carlin has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..

IMPALA-10525: Add param to BuiltinsDb to defer initialization

BuiltinsDb currently initializes all the builtin functions on
initialization. Part of the initialization task is to interact with
the C++ code to fetch the signatures of the functions.  This doesn't
work if a third party extension wants to use the BuiltinsDb but does
not have access to the C++ library at runtime.

The solution is to add an alternative way to initalize the BuiltinsDb,
through a Loader class.

Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
---
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
1 file changed, 15 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/93/17093/2
--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 2
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 


[Impala-ASF-CR] IMPALA-10536: Fix saml2 callback token ttl's description

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17107 )

Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description
..


Patch Set 2: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/6910/


--
To view, visit http://gerrit.cloudera.org:8080/17107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
Gerrit-Change-Number: 17107
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 23 Feb 2021 02:09:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10520: Implement ds theta intersect() function

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8198/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 23 Feb 2021 02:07:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10520: Implement ds theta intersect() function

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17088 )

Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17088/2/be/src/exprs/aggregate-functions.h
File be/src/exprs/aggregate-functions.h:

http://gerrit.cloudera.org:8080/#/c/17088/2/be/src/exprs/aggregate-functions.h@271
PS2, Line 271:   static void DsThetaIntersectUpdate(FunctionContext*, const 
StringVal& src, StringVal* dst);
line too long (93 > 90)


http://gerrit.cloudera.org:8080/#/c/17088/2/be/src/exprs/aggregate-functions.h@273
PS2, Line 273:   static void DsThetaIntersectMerge(FunctionContext*, const 
StringVal& src, StringVal* dst);
line too long (92 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 23 Feb 2021 01:47:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10520: Implement ds theta intersect() function

2021-02-22 Thread Fucun Chu (Code Review)
Fucun Chu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17088


Change subject: IMPALA-10520: Implement ds_theta_intersect() function
..

IMPALA-10520: Implement ds_theta_intersect() function

This function receives a set of serialized Apache DataSketches Theta
sketches produced by ds_theta_sketch() and intersects them into a
single sketch.

An example usage is to create a sketch for each partition of a table,
write these sketches to a separate table and intersect them to get an
estimates based on the partitions the user is interested in related
sketches. E.g.:
  SELECT
  ds_theta_estimate(ds_theta_intersect(sketch_col))
  FROM sketch_tbl
  WHERE partition_col=1 OR partition_col=5;

Testing:
  - Apart from the automated tests I added to this patch I also
tested ds_theta_intersect() on a bigger dataset to check that
serialization, deserialization and merging steps work well. I
took TPCH25.linelitem, created a number of sketches with grouping
by l_shipdate and called ds_theta_intersect() on those sketches

Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
---
M be/src/exprs/aggregate-functions-ir.cc
M be/src/exprs/aggregate-functions.h
M fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
4 files changed, 161 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/88/17088/2
--
To view, visit http://gerrit.cloudera.org:8080/17088
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I80e68c2151c4604f0386d3dfb004c82b10293f97
Gerrit-Change-Number: 17088
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-10533: Fix TestScratchDir.test scratch dirs mix local and remote dir spill local only seems flaky

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17102 )

Change subject: IMPALA-10533: Fix 
TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only 
seems flaky
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
Gerrit-Change-Number: 17102
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Tue, 23 Feb 2021 01:40:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 23 Feb 2021 01:24:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10510: Change code to help with third party extensions

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17079 )

Change subject: IMPALA-10510: Change code to help with third party extensions
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6912/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4911ddef232301c99ceada4635b72bf4e57ea9c7
Gerrit-Change-Number: 17079
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 22 Feb 2021 22:51:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10510: Change code to help with third party extensions

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17079 )

Change subject: IMPALA-10510: Change code to help with third party extensions
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4911ddef232301c99ceada4635b72bf4e57ea9c7
Gerrit-Change-Number: 17079
Gerrit-PatchSet: 4
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 22 Feb 2021 22:51:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10510: Change code to help with third party extensions

2021-02-22 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17079 )

Change subject: IMPALA-10510: Change code to help with third party extensions
..


Patch Set 3: Code-Review+2

These changes look okay to me.


--
To view, visit http://gerrit.cloudera.org:8080/17079
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4911ddef232301c99ceada4635b72bf4e57ea9c7
Gerrit-Change-Number: 17079
Gerrit-PatchSet: 3
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 22 Feb 2021 22:51:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-02-22 Thread Vihang Karajgaonkar (Code Review)
Vihang Karajgaonkar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17081/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17081/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3187
PS1, Line 3187: long txnId = MetastoreShim.openTransaction(msClient);
> Table metadata operations are not handled in a transactional way AFAIK, i.e
I think tools like Hive replication depend on the fact that these events happen 
in the correct order.



--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 22:49:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9767: Do not clean up filter while PublishFilter is ongoing

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17095 )

Change subject: IMPALA-9767: Do not clean up filter while PublishFilter is 
ongoing
..


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6911/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/17095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08
Gerrit-Change-Number: 17095
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 22 Feb 2021 22:07:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:38:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10536: Fix saml2 callback token ttl's description

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17107 )

Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6910/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
Gerrit-Change-Number: 17107
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:27:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10536: Fix saml2 callback token ttl's description

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17107 )

Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
Gerrit-Change-Number: 17107
Gerrit-PatchSet: 2
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:27:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10536: Fix saml2 callback token ttl's description

2021-02-22 Thread Attila Jeges (Code Review)
Attila Jeges has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17107 )

Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
Gerrit-Change-Number: 17107
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Attila Jeges 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:18:41 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10536: Fix saml2 callback token ttl's description

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17107 )

Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8197/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
Gerrit-Change-Number: 17107
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:14:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10533: Fix TestScratchDir.test scratch dirs mix local and remote dir spill local only seems flaky

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17102 )

Change subject: IMPALA-10533: Fix 
TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only 
seems flaky
..


Patch Set 1:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6909/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
Gerrit-Change-Number: 17102
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:01:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10533: Fix TestScratchDir.test scratch dirs mix local and remote dir spill local only seems flaky

2021-02-22 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17102 )

Change subject: IMPALA-10533: Fix 
TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only 
seems flaky
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/17102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
Gerrit-Change-Number: 17102
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 22 Feb 2021 20:01:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9218: Add support for locally compiled Hive

2021-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17094 )

Change subject: IMPALA-9218: Add support for locally compiled Hive
..


Patch Set 2:

(3 comments)

This makes sense to me. Can you add the new HIVE overrides to our 
README-build.md?
https://github.com/apache/impala/blob/master/README-build.md

The other thing we may want to mention is that the HIVE_VERSION_OVERRIDE does 
need to be 3.* at the moment due to how we use IMPALA_HIVE_MAJOR_VERSION 
(though supporting 4.* will happen in the future).

http://gerrit.cloudera.org:8080/#/c/17094/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17094/2//COMMIT_MSG@23
PS2, Line 23: export 
HIVE_HOME_OVERRIDE=~/hive/packaging/target/apache-hive-3.1.3000.7.1.1.0-SNAPSHOT-bin/apache-hive-3.1.3000.7.1.1.0-SNAPSHOT-bin
> Not sure what the best choice here is -
If the long lines were for the description of what the patch is doing, that 
would bother me, but this is for a code example. Being a bit long here doesn't 
bother me much.

That said, I don't think very many people would copy-paste this, so if you want 
to break up the lines, you can do line breaks wherever is easiest. Another 
option is to use a shorter version number. In other words, 3.1.0-SNAPSHOT 
rather than 3.1.0.3000.7.1.1.0-SNAPSHOT.


http://gerrit.cloudera.org:8080/#/c/17094/2/bin/bootstrap_toolchain.py
File bin/bootstrap_toolchain.py:

http://gerrit.cloudera.org:8080/#/c/17094/2/bin/bootstrap_toolchain.py@469
PS2, Line 469:   use_override_hive = "HIVE_VERSION_OVERRIDE" in os.environ \
 :   and os.environ["HIVE_VERSION_OVERRIDE"] != ""
Nit: for this, I think I would break the line right after the first equals, 
like:

use_override_hive = \
"HIVE_VERSION_OVERRIDE" in os.environ and 
os.environ["HIVE_VERSION_OVERRIDE"] != ""

I think that still fits in 90 characters.


http://gerrit.cloudera.org:8080/#/c/17094/2/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/17094/2/bin/impala-config.sh@254
PS2, Line 254: export 
IMPALA_HIVE_STORAGE_API_VERSION=${HIVE_STORAGE_API_VERSION_OVERRIDE:-"2.3.0.$IMPALA_HIVE_VERSION"}
> I've tried to use good taste in the line length here. I know there is a rec
The indentation here doesn't bother me. Shell scripts make this hard.



--
To view, visit http://gerrit.cloudera.org:8080/17094
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I21892c153c445e3a5d93f2bc8f5e0b799929dd34
Gerrit-Change-Number: 17094
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 22 Feb 2021 19:58:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10536: Fix saml2 callback token ttl's description

2021-02-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17107


Change subject: IMPALA-10536: Fix saml2_callback_token_ttl's description
..

IMPALA-10536: Fix saml2_callback_token_ttl's description

The unit in the  description of saml2_callback_token_ttl was
"seconds", while its value is interpreted as milliseconds and
the default 30 was way too low. Changing the description to
mention milliseconds.

Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
---
M be/src/util/backend-gflag-util.cc
M tests/custom_cluster/test_saml2_sso.py
2 files changed, 2 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/07/17107/1
--
To view, visit http://gerrit.cloudera.org:8080/17107
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib1057f0c5694883d1b1e14075876c780d6c942a8
Gerrit-Change-Number: 17107
Gerrit-PatchSet: 1
Gerrit-Owner: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-02-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 2: Code-Review+2

(1 comment)

This can go in from my side.

http://gerrit.cloudera.org:8080/#/c/17081/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17081/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3194
PS2, Line 3194:   LOG.error(String.format(
  :   "Exception caught during increasing write id for 
table %s: %s",
  :   tbl.getFullName(), ex));
  :   if (txnId != -1) {
  : try {
  :   MetastoreShim.abortTransaction(msClient, txnId);
  : } catch (TransactionException abortEx) {
  :   LOG.error(String.format(
  :   "Could not abort transaction: %d. This 
transaction was opened to " +
  :   "increase the write id of table %s during ADD 
PARTITION. Exception: %s",
  :   txnId, tbl.getFullName(), abortEx));
  : }
  :   }
nice to habve: there could be a function for this, e.g. 
cleanupFailedTransaction(long txnId, String operation, Exception ex)



--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 19:51:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6908/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 22 Feb 2021 19:42:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8196/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 22 Feb 2021 19:30:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17100/1/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17100/1/be/src/runtime/tmp-file-mgr.cc@171
PS1, Line 171: if
> formatting (needs a space after 'if')
Done


http://gerrit.cloudera.org:8080/#/c/17100/1/be/src/runtime/tmp-file-mgr.cc@1954
PS1, Line 1954:
> The formatting here was correct before.
Done



--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 22 Feb 2021 19:12:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..

IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan 
build

Fixed a data race issue when running testcase
DiskIoMgrTest.WriteToRemotePartialFileSuccess in the tsan build. The
cause is the TmpFileBufferPool is not released gracefully while
destruction.

Tests:
Reran all testcases of DiskIoMgrTest with tsan and asan build.

Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
---
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr.cc
2 files changed, 19 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/17100/2
--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 


[Impala-ASF-CR] IMPALA-10504: Add tracing for remote block reads

2021-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17062 )

Change subject: IMPALA-10504: Add tracing for remote block reads
..


Patch Set 5:

This is looking good to me. Please wrap the long lines, then I'll bump to +2.


--
To view, visit http://gerrit.cloudera.org:8080/17062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c6a3e92f44813048022edf2b91299b3b0a20257
Gerrit-Change-Number: 17062
Gerrit-PatchSet: 5
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 22 Feb 2021 18:54:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10504: Add tracing for remote block reads

2021-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17062 )

Change subject: IMPALA-10504: Add tracing for remote block reads
..


Patch Set 5: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/17062
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8c6a3e92f44813048022edf2b91299b3b0a20257
Gerrit-Change-Number: 17062
Gerrit-PatchSet: 5
Gerrit-Owner: Kurt Deschler 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Mon, 22 Feb 2021 18:54:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-9767: Do not clean up filter while PublishFilter is ongoing

2021-02-22 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17095 )

Change subject: IMPALA-9767: Do not clean up filter while PublishFilter is 
ongoing
..


Patch Set 3: Code-Review+2

This looks good to me. Thanks for taking this on!


--
To view, visit http://gerrit.cloudera.org:8080/17095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08
Gerrit-Change-Number: 17095
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 22 Feb 2021 18:47:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Thomas Tauber-Marshall (Code Review)
Thomas Tauber-Marshall has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17100/1/be/src/runtime/tmp-file-mgr.cc
File be/src/runtime/tmp-file-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/17100/1/be/src/runtime/tmp-file-mgr.cc@171
PS1, Line 171: if(
formatting (needs a space after 'if')


http://gerrit.cloudera.org:8080/#/c/17100/1/be/src/runtime/tmp-file-mgr.cc@1954
PS1, Line 1954: /
The formatting here was correct before.

If you're not already aware, there's a tool called clang-format-diff that you 
can use to do the formatting automatically, see: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65868536



--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Tauber-Marshall 
Gerrit-Comment-Date: Mon, 22 Feb 2021 18:19:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 1:

(1 comment)

Sounds like a good idea to have a Loader and use it to verify that the content 
from the 3rd party tool is exactly the same as what is expected by Impala FE 
and all the way to BE.

http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@91
PS1, Line 91: if (initBuiltins) {
:   initBuiltins();
: }
> See general comment for perhaps a better way to implement.
Perhaps that a verifier should be coded to make sure that the 3rd party tools 
produce exact the same items as FE currently does.



--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 22 Feb 2021 17:49:51 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10371: test java udfs crash impalad if result spooling is enabled

2021-02-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17106 )

Change subject: IMPALA-10371: test_java_udfs crash impalad if result spooling 
is enabled
..


Patch Set 2: Code-Review+1

lgtm, will upgrade to +2 once the sample is updated


--
To view, visit http://gerrit.cloudera.org:8080/17106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3aadb8ccc0f1f9b7b87a5744c22a0555b325ee6
Gerrit-Change-Number: 17106
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 22 Feb 2021 17:34:49 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10371: test java udfs crash impalad if result spooling is enabled

2021-02-22 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17106 )

Change subject: IMPALA-10371: test_java_udfs crash impalad if result spooling 
is enabled
..


Patch Set 2: Code-Review+1

Hi Daniel,
The patch works for me. Test it myself against the 3 failed tests and all pass 
with this patch.
Please carry my +1 after revising the Sample IR documentation.


--
To view, visit http://gerrit.cloudera.org:8080/17106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3aadb8ccc0f1f9b7b87a5744c22a0555b325ee6
Gerrit-Change-Number: 17106
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 22 Feb 2021 17:32:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10371: test java udfs crash impalad if result spooling is enabled

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17106 )

Change subject: IMPALA-10371: test_java_udfs crash impalad if result spooling 
is enabled
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8195/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie3aadb8ccc0f1f9b7b87a5744c22a0555b325ee6
Gerrit-Change-Number: 17106
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 22 Feb 2021 17:28:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5675: Support UTF-8 Varchar and Char types

2021-02-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16909 )

Change subject: IMPALA-5675: Support UTF-8 Varchar and Char types
..


Patch Set 14:

(26 comments)

Started the review maybe 1/3 way through. Will continue tomorrow.

Overall, it looks good to me.

One major concern is the use of utf8 mode query option. It may be OK for debug 
purpose. In practice, however it may be better to rely on table property (utf8 
or non utf8) to make the whole thing automatic.

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/exprs/anyval-util.h
File be/src/exprs/anyval-util.h:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/exprs/anyval-util.h@310
PS14, Line 310: sv->ptr = const_cast(reinterpret_cast(slot));
  : sv->len = (type.is_utf8 && type.type == TYPE_CHAR)?
  : FindUtf8Pos(reinterpret_cast(slot), 
type.len * 4, type.len)
  : : type.len;
Maybe separate the two cases to make the code cleaner.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/exprs/slot-ref.cc
File be/src/exprs/slot-ref.cc:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/exprs/slot-ref.cc@249
PS14, Line 249: type_.type == TYPE_FIXED_UDA_INTERMEDIATE
I thought TYPE_FIXED_UDA_INTERMEDIATE is utf8 irrelevant, no?


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/exprs/slot-ref.cc@393
PS14, Line 393: )
I thought TYPE_FIXED_UDA_INTERMEDIATE is utf8 irrelevant, no?


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/string-value.h
File be/src/runtime/string-value.h:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/string-value.h@45
PS14, Line 45:   /// The length of the string in bytes.
May append "regardless whether it is binary or UTF8".


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.h
File be/src/runtime/types.h:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.h@73
PS14, Line 73: /// It's the logical length of the type. E.g. CHAR(3) has len=3 
regardless of whether
 :   /// we are in UTF-8 mode
Elsewhere in the code, it seems 'len' is interpreted as length in bytes for 
VARCHAR. For example, GetMaxStrLen() is defined only for CHAR.

Suggest to distinguish between the length in characters and the length in 
bytes. For example, use 'len' for length in characters, and add a new data 
member 'len_in_bytes' to facilitate computation with char/varchar/string types.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.h@119
PS14, Line 119: int len
Add a comment on the unit of 'len'.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.h@129
PS14, Line 129: int len,
Add a comment on the unit of 'len'.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.h@297
PS14, Line 297: GetMaxStrLen
It will be good if we pay the cost of storing the length in bytes once 
(computing it once in the cstr) to avoid the computation many times.

The method could be named as GetStorageLen() or GetStorageSize().


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.cc
File be/src/runtime/types.cc:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.cc@52
PS14, Line 52: type == TYPE_FIXED_UDA_INTERMEDIATE
Consider make a special case for TYPE_FIXED_UDA_INTEMEDIATE which cares only 
the len.

For CHAR, VARCHAR and STRING, do both len and utf8.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/runtime/types.cc@333
PS14, Line 333: STRING UTF8
Probably should be STRING(UTF8).


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/udf/udf-internal.h
File be/src/udf/udf-internal.h:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/udf/udf-internal.h@148
PS14, Line 148: ,
May add a comment "to return storage size".


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/udf/udf.h
File be/src/udf/udf.h:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/udf/udf.h@106
PS14, Line 106: int
Maybe indicate it is the length in characters.

Ideally, we can make use of a LENGTH_TYPE (instead of int) here.

struct LENGTH_TYPE {
  int char_len;
  int storage_len;
}


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/util/bit-util.h
File be/src/util/bit-util.h:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/util/bit-util.h@133
PS14, Line 133: >
Should be ">=" as 0x00 is a single byte UTF8 character.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/util/bit-util.h@138
PS14, Line 138: }
May add a DCHECK(false) here.


http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/util/string-util.cc
File be/src/util/string-util.cc:

http://gerrit.cloudera.org:8080/#/c/16909/14/be/src/util/string-util.cc@96
PS14, Line 96:  if (utf8_cnt != nullptr) *utf8_cnt = 0;
 :   if (len == 0) return 0;
These two lines can be removed.


http://gerrit.cloudera.org:8080/#/c/16909/14/common/thrift/Types.thrift
File common/thrift/Types.thrift:


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Steve Carlin (Code Review)
Steve Carlin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 1:

(3 comments)

For the most part, the BuiltIn instance will have the exact same contents. In 
my implementation though, I created a static file containing all the dirty 
details, like the C++ signatures.  After the Builtin instance is created, the 
"BuiltInDb.getInstance().addFunction is called for all the built in functions.

I went with this implementation because I was looking for minimal impact on the 
Impala code base.  I could go for something a bit more complicated if this 
doesn't meet code review requirements.

If we're concerned about immutability, one potential solution is to have a 
Loader interface that loads in the BuiltinDb the way an external user of the 
class prefers.  I think it would have to be a singleton as well and look like 
this:

private static volatile Loader loader;
public static setLoader(Loader loader) {
  this.loader = loader;
}

...and then in the Builtins constructor, call loader.initBuiltins(this);

The Impala version would have a BuiltinLoader class with the "initBuiltins()" 
method that contains all the static initializations.

Would that be more to your liking?

http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@81
PS1, Line 81:   public static synchronized Db getInstance(boolean initBuiltins) 
{
> Should we make 'initBuiltins' a static field so if someone call both getIns
See general comment for perhaps a better way to implement.

But if I stick with this, the thing is that in my implementation, it's really 
only slightly deferred.  The parameter only matters when the singleton hasn't 
been initialized.  After that, it doesn't matter if you call it with true or 
false.


http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@91
PS1, Line 91: if (initBuiltins) {
:   initBuiltins();
: }
> This seems disable the init of all builtins. My concern is the following:
See general comment for perhaps a better way to implement.

But to address these comments:

The third party tool is responsible for all the initializations.  There's no 
mix.  And the 3rd party should match the functions that Impala uses to populate.

There are future plans to make sure that the Impala initialization matches the 
3rd party initialization, but that is a bigger rewrite, and beyond the scope of 
what can be developed at this point.


http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@99
PS1, Line 99: private
> initBuiltins() is private. Will it be called after we init the singleton wi
See general comment for perhaps a better way to implement.

But to address this comment:

The third part is using BuiltinDb.addFunction() to populate the functions.

I can address the rename, will do so after discussing the other proposal I 
mentioned on the top level.



--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Mon, 22 Feb 2021 17:08:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10371: test java udfs crash impalad if result spooling is enabled

2021-02-22 Thread Daniel Becker (Code Review)
Daniel Becker has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17106


Change subject: IMPALA-10371: test_java_udfs crash impalad if result spooling 
is enabled
..

IMPALA-10371: test_java_udfs crash impalad if result spooling is enabled

IMPALA-7658 introduced proper codegen for HiveUdfCall. Because of a bug
in LLVM (see https://bugs.llvm.org/show_bug.cgi?id=21431), codegen code
could not use JniUtil::GetJNIEnv directly as it involves thread-local
variables, which LLVM JIT does not (yet) support. The original solution
to get around this problem was to cache the JNIEnv pointer in the
FunctionContext but it turned out that this leads to a crash if result
spooling is enabled because multiple threads can end up using the same
JNIEnv object.

This commit fixes the problem by using a different solution: instead of
caching the JNIEnv pointer we use a wrapper function
(HiveUdfCall:GetJniEnvNotInlined) that prevents JniUtil::GetJNIEnv from
being inlined in codegen code, thereby ensuring that the handling of the
thread-local variable is compiled by GCC.

Testing:
  Manually verified that TestUdfExecution::test_java_udfs passes with
  SPOOL_QUERY_RESULTS enabled.

Change-Id: Ie3aadb8ccc0f1f9b7b87a5744c22a0555b325ee6
---
M be/src/exprs/hive-udf-call-ir.cc
M be/src/exprs/hive-udf-call.cc
M be/src/exprs/hive-udf-call.h
3 files changed, 87 insertions(+), 97 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/06/17106/2
--
To view, visit http://gerrit.cloudera.org:8080/17106
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie3aadb8ccc0f1f9b7b87a5744c22a0555b325ee6
Gerrit-Change-Number: 17106
Gerrit-PatchSet: 2
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-9767: Do not clean up filter while PublishFilter is ongoing

2021-02-22 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17095 )

Change subject: IMPALA-9767: Do not clean up filter while PublishFilter is 
ongoing
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17095/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17095/2//COMMIT_MSG@10
PS2, Line 10: the memory
: consumed by
> nit: Maybe it would be a bit clearer if we change it to the following.
Done



--
To view, visit http://gerrit.cloudera.org:8080/17095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08
Gerrit-Change-Number: 17095
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Mon, 22 Feb 2021 16:39:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-9767: Do not clean up filter while PublishFilter is ongoing

2021-02-22 Thread Riza Suminto (Code Review)
Hello Fang-Yu Rao, Joe McDonnell, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17095

to look at the new patch set (#3).

Change subject: IMPALA-9767: Do not clean up filter while PublishFilter is 
ongoing
..

IMPALA-9767: Do not clean up filter while PublishFilter is ongoing

There have been occurrences of heap-use-after-free in ASAN build during
runtime filter publishing. This issue happens because the memory
consumed by an aggregated Bloom filter has been released while the
coordinator is still sending the aggregated filter via KRPC to workers.
This patch removes the offending cleanup routine. This patch also
decouples the cleanup routine from FilterState::DisableAndRelease() into
a separate method FilterState::Release() and asserts that no RPC is
inflight while cleaning up the filter.

Testing:
- Reproduce the bug by instrumenting Coordinator::UpdateFilter().
- Manually verify that the bug does not happen anymore after the patch.
- Pass core tests in ASAN build.

Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-filter-state.h
M be/src/runtime/coordinator.cc
3 files changed, 18 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/95/17095/3
--
To view, visit http://gerrit.cloudera.org:8080/17095
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I1c408bdedab83c4b9249e2c0c493cb0f894a3d08
Gerrit-Change-Number: 17095
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..


Patch Set 4: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 15:54:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..

IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_

We had a DCHECK in ScalarColumnReader::MaterializeValueBatch() that
checked that 'num_buffered_values_' is greater or equal to the
number of cached values in the Parquet definition level decoder.

In SkipTopLevelRows() we used decoder.ReadLevel() which loaded
the cache of the decoder with probably more values than the
actual value count. It is because literal runs are stored in groups
of 8, i.e. there might be padding zeros at the end.

Alternatively we can fill the cache of the decoder with
CacheNextBatch(num_vals). In this case we won't load more values
than the actual value count.

Testing
 * until this patch TestParquetStats::test_page_index was flaky
   because of this issue
 * I tested the solution on a hacked Impala that randomly generated
   skip ranges

Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Reviewed-on: http://gerrit.cloudera.org:8080/17071
Reviewed-by: Zoltan Borok-Nagy 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-level-decoder.h
2 files changed, 18 insertions(+), 3 deletions(-)

Approvals:
  Zoltan Borok-Nagy: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17100 )

Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8193/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 14:51:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10526: Fix BufferPoolTest.Multi8RandomSpillToRemoteMix failed in sanitizer builds

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17096 )

Change subject: IMPALA-10526: Fix BufferPoolTest.Multi8RandomSpillToRemoteMix 
failed in sanitizer builds
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8194/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic77dd85d20efc7066758b2ba61e2138745cbd90b
Gerrit-Change-Number: 17096
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 14:50:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10531: Fix TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in exhaustive release build

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17101 )

Change subject: IMPALA-10531: Fix 
TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in 
exhaustive release build
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8192/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7320ae41957c44151b7ec17886c383e72c2546e4
Gerrit-Change-Number: 17101
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 14:45:29 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10526: Fix BufferPoolTest.Multi8RandomSpillToRemoteMix failed in sanitizer builds

2021-02-22 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17096


Change subject: IMPALA-10526: Fix BufferPoolTest.Multi8RandomSpillToRemoteMix 
failed in sanitizer builds
..

IMPALA-10526: Fix BufferPoolTest.Multi8RandomSpillToRemoteMix failed in 
sanitizer builds

Fixed a data race issue when running testcase
BufferPoolTest.Multi8RandomSpillToRemoteMix in the tsan build. The
solution is to access ScanRange::use_local_buffer_ under
ScanRange::lock_.

Tests:
Rerun testcase BufferPoolTest.Multi8RandomSpillToRemoteMix.

Change-Id: Ic77dd85d20efc7066758b2ba61e2138745cbd90b
---
M be/src/runtime/io/request-ranges.h
M be/src/runtime/io/scan-range.cc
2 files changed, 26 insertions(+), 13 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/96/17096/1
--
To view, visit http://gerrit.cloudera.org:8080/17096
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic77dd85d20efc7066758b2ba61e2138745cbd90b
Gerrit-Change-Number: 17096
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan build

2021-02-22 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17100


Change subject: IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess 
failed in tsan build
..

IMPALA-10527: Fix DiskIoMgrTest.WriteToRemotePartialFileSuccess failed in tsan 
build

Fixed a data race issue when running testcase
DiskIoMgrTest.WriteToRemotePartialFileSuccess in the tsan build. The
cause is the TmpFileBufferPool is not released gracefully while
destruction.

Tests:
Reran all testcases of DiskIoMgrTest with tsan and asan build.

Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
---
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr.cc
2 files changed, 20 insertions(+), 9 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/17100/1
--
To view, visit http://gerrit.cloudera.org:8080/17100
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I6b0435bf0ee580acb5553527c0ed4f3aa806707f
Gerrit-Change-Number: 17100
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-10531: Fix TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in exhaustive release build

2021-02-22 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17101


Change subject: IMPALA-10531: Fix 
TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in 
exhaustive release build
..

IMPALA-10531: Fix 
TmpFileMgrTest.TestCompressBufferManagementEncryptedRemoteUpload failed in 
exhaustive release build

Fixed TmpFileMgrTest testcases failed in exhaustive release build.
The cause is that in TmpFileGroup::AllocateRemoteSpace(), it uses
DCHECK to check the return of TmpFileRemote::AllocateSpace(), because
it always returns true, in the release build, the logic is optimized,
so TmpFileRemote::AllocateSpace() isn't called in the release build.
The solution is to use if instead of DCHECK.

Tests:
Reran TmpFileMgrTest/DiskIoMgrTest/BufferPoolTest in the release build.

Change-Id: I7320ae41957c44151b7ec17886c383e72c2546e4
---
M be/src/runtime/tmp-file-mgr.cc
1 file changed, 12 insertions(+), 13 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/01/17101/1
--
To view, visit http://gerrit.cloudera.org:8080/17101
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7320ae41957c44151b7ec17886c383e72c2546e4
Gerrit-Change-Number: 17101
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-10533: Fix TestScratchDir.test scratch dirs mix local and remote dir spill local only seems flaky

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17102 )

Change subject: IMPALA-10533: Fix 
TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only 
seems flaky
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8191/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
Gerrit-Change-Number: 17102
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 22 Feb 2021 14:09:22 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10525: Add param to BuiltinsDb to defer initialization

2021-02-22 Thread Qifan Chen (Code Review)
Qifan Chen has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17093 )

Change subject: IMPALA-10525: Add param to BuiltinsDb to defer initialization
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java
File fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java:

http://gerrit.cloudera.org:8080/#/c/17093/1/fe/src/main/java/org/apache/impala/catalog/BuiltinsDb.java@91
PS1, Line 91: if (initBuiltins) {
:   initBuiltins();
: }
This seems disable the init of all builtins. My concern is the following:
1. The 3rd party tools only provide some coverage in functionality, and relies 
on Impala's for the rest. If so, a reduced scope in bypassing the 
initBuiltins(this) calls would be better.
2. FE uses the builtIn functions to process a query, such as one containing 
NDV() function. So disabling these builtins entirely have an impact. An 
alternative approach would be to just disable the loading of the C++ signature 
part, which makes it more inline with the requirement to manage 3rd party tools 
that do not have the C++ implementations.



--
To view, visit http://gerrit.cloudera.org:8080/17093
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1941a2494efe15b63514d849873eeb1c4ed8a981
Gerrit-Change-Number: 17093
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 22 Feb 2021 14:00:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10533: Fix TestScratchDir.test scratch dirs mix local and remote dir spill local only seems flaky

2021-02-22 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/17102


Change subject: IMPALA-10533: Fix 
TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only 
seems flaky
..

IMPALA-10533: Fix 
TestScratchDir.test_scratch_dirs_mix_local_and_remote_dir_spill_local_only 
seems flaky

The E2E testcase emulates the situation when there are two types of
scratch directories, the data only spills to the local one when the
space of local directory is sufficient. The testcase works fine for
the debug build, however in the release build, the system runs faster
and more data is spilled from memory which exceeds the setting of the
local scratch space limit. To solve this, the size limit of local
scratch space is changed from 100M to 2GB, so that allows all of the
spilled data is in the local instead of the remote directory.

Tests:
Reran test_scratch_dirs_mix_local_and_remote_dir_spill_local_only in
the release build.

Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/17102/1
--
To view, visit http://gerrit.cloudera.org:8080/17102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If2dc32196b2554aee9fc94a4ccbbf5803dbcce1d
Gerrit-Change-Number: 17102
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8190/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 11:40:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-02-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17081/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17081/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3188
PS1, Line 3188:
> No, I don't think that we should do an UNDO, but it would be probably bette
Added logic to abort the txn.



--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 11:23:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-02-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Vihang Karajgaonkar, Gabor Kaszab, Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17081

to look at the new patch set (#2).

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..

IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

ALTER TABLE ADD PARTITION should bump the write id for ACID tables.
Both for INSERT-only and full ACID tables.

Since table metadata changing is independent of ACID transactions,
we are doing it in a best-effort way. I.e. if the ADD PARTITION
operation was successful, then we try to bump the write id of the
table. If bumping the write id fails we swallow the exception and
log the error. This might cause minor glitches for engines that
cache table metadata and update it based on write id change. A
symptom is that SHOW PARTITIONS might not display the newly added empty
partition. Query results should not be affected.

Testing:
 * added e2e test

Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
---
M fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
M tests/query_test/test_acid.py
2 files changed, 60 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/81/17081/2
--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..


Patch Set 4: Code-Review+2

Carry +2


--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 11:09:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10505: Avoid creating misleading audit logs

2021-02-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17078 )

Change subject: IMPALA-10505: Avoid creating misleading audit logs
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@172
PS2, Line 172: "alltypes_view", TPrivilegeLevel.SELECT));
Could you add more tests? E.g. what are the audit events in these cases:
case1: user has privilege to read the view and all underlying tables
case2: user has privilege to read the view and some of the underlying tables.



--
To view, visit http://gerrit.cloudera.org:8080/17078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f40eb96d6ed863cd2cd88d717c354dc351a64c
Gerrit-Change-Number: 17078
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 22 Feb 2021 10:44:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..


Patch Set 3:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8189/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 10:23:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/6907/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 10:09:53 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17071 )

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..


Patch Set 3:

(1 comment)

Yeah, unfortunately with Hive3 it's harder generate custom Parquet files. But 
at least we have test_page_index, which shouldn't be flaky anymore (al least 
not because of this issue).

http://gerrit.cloudera.org:8080/#/c/17071/2/be/src/exec/parquet/parquet-level-decoder.h
File be/src/exec/parquet/parquet-level-decoder.h:

http://gerrit.cloudera.org:8080/#/c/17071/2/be/src/exec/parquet/parquet-level-decoder.h@88
PS2, Line 88: LIKELY(Cache
> LIKELY could be added
Done



--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 10:07:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: def levels .CacheRemaining() <= num buffered values

2021-02-22 Thread Zoltan Borok-Nagy (Code Review)
Hello Csaba Ringhofer, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/17071

to look at the new patch set (#3).

Change subject: IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_
..

IMPALA-10501: Hit DCHECK in parquet-column-readers.cc: 
def_levels_.CacheRemaining() <= num_buffered_values_

We had a DCHECK in ScalarColumnReader::MaterializeValueBatch() that
checked that 'num_buffered_values_' is greater or equal to the
number of cached values in the Parquet definition level decoder.

In SkipTopLevelRows() we used decoder.ReadLevel() which loaded
the cache of the decoder with probably more values than the
actual value count. It is because literal runs are stored in groups
of 8, i.e. there might be padding zeros at the end.

Alternatively we can fill the cache of the decoder with
CacheNextBatch(num_vals). In this case we won't load more values
than the actual value count.

Testing
 * until this patch TestParquetStats::test_page_index was flaky
   because of this issue
 * I tested the solution on a hacked Impala that randomly generated
   skip ranges

Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
---
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-level-decoder.h
2 files changed, 18 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/71/17071/3
--
To view, visit http://gerrit.cloudera.org:8080/17071
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic071473e7b315300fd5e163225d3e39735f09c4f
Gerrit-Change-Number: 17071
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-10505: Avoid creating misleading audit logs

2021-02-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17078 )

Change subject: IMPALA-10505: Avoid creating misleading audit logs
..


Patch Set 2:

(4 comments)

The fix looks good to me. Just have few minor comments.

http://gerrit.cloudera.org:8080/#/c/17078/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17078/2//COMMIT_MSG@47
PS2, Line 47: Verified that a user not granted the privilege on the underlying
:table(s) of a view is still not able to access the runtime 
profile or
:execution summary even though the user is granted the 
privilege on
:the view.
Do we have test for this?


http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java
File 
fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java:

http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@473
PS2, Line 473:   tmpAuditHandler.getAuthzEvents().clear();
I have another thought that if we don't need these audits, we don't need to 
generate them at all. What about setting tmpAuditHandler to null when 
retainAudits=false? Then we don't need this if-clause and just need to add 
check of 'tmpAuditHandler != null' at line 475.


http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java@505
PS2, Line 505:   tmpAuditHandler.getAuthzEvents().clear();
Same point as the above comment.


http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java
File 
fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java:

http://gerrit.cloudera.org:8080/#/c/17078/2/fe/src/test/java/org/apache/impala/authorization/ranger/RangerAuditLogTest.java@163
PS2, Line 163:
nit: two blank lines here, please remove one.



--
To view, visit http://gerrit.cloudera.org:8080/17078
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02f40eb96d6ed863cd2cd88d717c354dc351a64c
Gerrit-Change-Number: 17078
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Fang-Yu Rao 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 22 Feb 2021 09:11:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5675: Support UTF-8 Varchar and Char types

2021-02-22 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16909 )

Change subject: IMPALA-5675: Support UTF-8 Varchar and Char types
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/8188/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/16909
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62efa3042c64d1d005a2cf4fd1d31e992543963f
Gerrit-Change-Number: 16909
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Feb 2021 08:40:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5675: Support UTF-8 Varchar and Char types

2021-02-22 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/16909 )

Change subject: IMPALA-5675: Support UTF-8 Varchar and Char types
..


Patch Set 14:

(5 comments)

Thanks for your review, Tim! Addressed the comments.

http://gerrit.cloudera.org:8080/#/c/16909/13//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16909/13//COMMIT_MSG@7
PS13, Line 7: IMPALA-5675: Support UTF-8 Varchar and Char types
> One minor thing : I think we should comment the len field, e.g. in StringVa
Done.

In StringValue and StringVal, it's still the length in bytes of the string.

In ScalarType(FE), ColumnType and TypeDesc, it's now the logical length of the 
type. So CHAR(3) will always has len=3 regradless whether we are in UTF-8 mode. 
The difference happens in how we use the len field. In UTF-8 mode, we have 
special logics like allocating 4*N bytes for a CHAR(N) slot.

Added comments in ScalarType.java, ColumnType(types.h), TypeDesc(udf.h) and 
StringValue(string-value.h).


http://gerrit.cloudera.org:8080/#/c/16909/13//COMMIT_MSG@45
PS13, Line 45:
> In the backend, how did you identify all the places you needed to change th
Yeah, this is the difficulty of this work. Thanks for listing out the places. 
They are covered in this patch.

In developing this patch, I checked all places where type.len is used. 
Previously they are treated as length in bytes. Truncating and padding happen 
there. Now in UTF-8 mode, they should be treated as length in characters so 
need code changes.

Hopefully, I didn't miss any places. I'll try to add more tests to improve our 
confidence.


http://gerrit.cloudera.org:8080/#/c/16909/13/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/16909/13/be/src/exec/kudu-scanner.cc@392
PS13, Line 392: // it is too long.
> Should we add a DCHECK in UTF-8 mode to verify that the strings returned fr
Done


http://gerrit.cloudera.org:8080/#/c/16909/13/be/src/exprs/cast-functions-ir.cc
File be/src/exprs/cast-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/16909/13/be/src/exprs/cast-functions-ir.cc@218
PS13, Line 218:   // The format string can have non-ASCII characters using 
double quoted nested strings.
> I don't think they could? Custom timestamp formats could have them, but tha
Yes, unfortunately, after some investigation, I found the format can have 
customized strings surrounded with double quotes [1].

When casting TIMESTAMP to CHAR(n), FE will split it into two castings: 
TIMESTAMP -> STRING and STRING -> CHAR(n). (Codes are in CastExpr#analyze()). 
So casting to CHAR(n) is currently correct since STRING to CHAR(n) casting is 
correct.
However, when casting TIMESTAMP to VARCHAR(n), the current result is wrong if 
there are non-ascii characters in the format. Example queries are in [2]. I'll 
fix the varchar issue.

Gerrit can't show non-ascii characters in comments. I uploaded them in 
snapshots:
[1] 
https://drive.google.com/file/d/1sgjSTBgAGCGZU3nQGfOSUBFw35lrvzoq/view?usp=sharing
[2] 
https://drive.google.com/file/d/1cirgYIEnssf2CsoaSpRD1m9pxN78VNua/view?usp=sharing


http://gerrit.cloudera.org:8080/#/c/16909/13/be/src/exprs/scalar-expr-evaluator.cc
File be/src/exprs/scalar-expr-evaluator.cc:

http://gerrit.cloudera.org:8080/#/c/16909/13/be/src/exprs/scalar-expr-evaluator.cc@316
PS13, Line 316:   // TODO: consider returning reference of the StringVal in 
UTF-8 mode.
> You might have to be careful here with UDFs, since they could return a len
Yeah, but it's ok for now since we don't support UDFs that returns CHAR or 
VARCHAR:
https://github.com/apache/impala/blob/d271baa33da1a02aa6ffc47b0380dc62239107b4/fe/src/main/java/org/apache/impala/analysis/CreateUdfStmt.java#L75-L80

So only builtin cast functions can go here.



--
To view, visit http://gerrit.cloudera.org:8080/16909
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I62efa3042c64d1d005a2cf4fd1d31e992543963f
Gerrit-Change-Number: 16909
Gerrit-PatchSet: 14
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 22 Feb 2021 08:22:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5675: Support UTF-8 Varchar and Char types

2021-02-22 Thread Quanlong Huang (Code Review)
Hello Tim Armstrong, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/16909

to look at the new patch set (#14).

Change subject: IMPALA-5675: Support UTF-8 Varchar and Char types
..

IMPALA-5675: Support UTF-8 Varchar and Char types

This patch adds support for UTF-8 aware varchar and char types. In
UTF-8 mode, when truncating UTF-8 varchar(N) and char(N) strings,
lengths will be counted by UTF-8 characters instead of bytes. So the
result string will have up to N UTF-8 characters.

The UTF8_MODE query option is first detected in FE when analyzing the
query. A 'is_utf8' label is added in Exprs and SlotDescriptors. They are
used in generating thrift objects and computing the tuple layouts. A
char(N) slot will occupy 4 * N bytes if it's in UTF-8 type, because a
UTF-8 character can be encoded into 1~4 bytes. The slot will store up to
N UTF-8 characters.

There is a gotcha that we should not add the label in Type.java, because
Type instances are shared across the FE. Query compilation reuses the
Type instances from the metadata. If we modify Type instances during
compilation, other queries in non-UTF8 mode will be affected.

However, in BE, we need the type related classes (e.g. ColumnType,
TypeDesc) to carry in the utf8 markers. It's impractical to check the
UTF8_MODE query option everywhere it needs to be. E.g. in
AnyValUtil::SetAnyVal we can't access the query options. So we add the
'is_utf8' marker in TScalarType, ColumnType, TypeDesc to conveniently
distinguish char(N) and varchar(N) types in UTF-8 mode. When generating
thrift objects in FE, Exprs and SlotDescriptors deliver 'is_utf8'
markers to TScalaTypes. They finally landed in ColumnType and TypeDesc
instances.

Given the correct UTF-8 mode checked, we just need to truncate/pad the
char/varchar strings with their length counted by UTF-8 characters.

Since char(N) slots always occupy 4N bytes, when converting char(N) to
other string types, we need to re-calculate the actual length
corresponding to N UTF-8 characters. We can optimize this in later
patches, e.g. store the UTF-8 length in the slot, or deal with UTF-8
char(N) by the same way as varchar(N), i.e. reallocate the string space
and just store the pointer and length in the slot.

Tests:
 - Add tests for reading char(N) and varchar(N) columns in UTF8_MODE.
 - Add truncating/padding tests
 - Kudu only supports Varchar currently. Add special tests for Kudu.
 - Add tests for writing CHAR(N)/VARCHAR(N) in UTF-8 mode.

Change-Id: I62efa3042c64d1d005a2cf4fd1d31e992543963f
---
M be/src/codegen/codegen-anyval.cc
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/llvm-codegen.cc
M be/src/exec/data-source-scan-node.cc
M be/src/exec/grouping-aggregator.cc
M be/src/exec/hdfs-avro-scanner-ir.cc
M be/src/exec/hdfs-avro-scanner-test.cc
M be/src/exec/hdfs-avro-scanner.cc
M be/src/exec/hdfs-avro-scanner.h
M be/src/exec/hdfs-text-table-writer.cc
M be/src/exec/kudu-scanner.cc
M be/src/exec/kudu-table-sink.cc
M be/src/exec/kudu-util.cc
M be/src/exec/kudu-util.h
M be/src/exec/orc-column-readers.cc
M be/src/exec/parquet/hdfs-parquet-table-writer.cc
M be/src/exec/parquet/parquet-column-readers.cc
M be/src/exec/parquet/parquet-column-stats.inline.h
M be/src/exec/parquet/parquet-common.h
M be/src/exec/parquet/parquet-plain-test.cc
M be/src/exec/text-converter.cc
M be/src/exec/text-converter.inline.h
M be/src/exprs/agg-fn-evaluator.cc
M be/src/exprs/anyval-util.cc
M be/src/exprs/anyval-util.h
M be/src/exprs/cast-functions-ir.cc
M be/src/exprs/scalar-expr-evaluator.cc
M be/src/exprs/scalar-fn-call.cc
M be/src/exprs/slot-ref.cc
M be/src/runtime/raw-value-ir.cc
M be/src/runtime/raw-value.cc
M be/src/runtime/raw-value.inline.h
M be/src/runtime/string-value.h
M be/src/runtime/tuple.cc
M be/src/runtime/types.cc
M be/src/runtime/types.h
M be/src/service/fe-support.cc
M be/src/service/hs2-util.cc
M be/src/udf/udf-internal.h
M be/src/udf/udf.cc
M be/src/udf/udf.h
M be/src/util/CMakeLists.txt
M be/src/util/bit-util.h
M be/src/util/dict-encoding.h
M be/src/util/string-util-test.cc
M be/src/util/string-util.cc
M be/src/util/string-util.h
M be/src/util/tuple-row-compare.cc
M common/thrift/Types.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/catalog/ScalarType.java
M fe/src/main/java/org/apache/impala/catalog/Type.java
M fe/src/main/java/org/apache/impala/service/Frontend.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
M 

[Impala-ASF-CR] IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write id for ACID tables

2021-02-22 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17081 )

Change subject: IMPALA-10512: ALTER TABLE ADD PARTITION should bump the write 
id for ACID tables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17081/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/17081/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@3188
PS1, Line 3188: allocateTableWriteId
> In that case the new partition will be added, but we don't increase the wri
No, I don't think that we should do an UNDO, but it would be probably better to 
abort the transaction and not rely on timeouting on HMS side.



--
To view, visit http://gerrit.cloudera.org:8080/17081
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad247008b7c206db00516326c1447bd00a9b34bd
Gerrit-Change-Number: 17081
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 22 Feb 2021 08:00:51 +
Gerrit-HasComments: Yes