[Impala-ASF-CR] IMPALA-11629: Support for huawei OBS FileSystem

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19110 )

Change subject: IMPALA-11629: Support for huawei OBS FileSystem
..


Patch Set 11:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12320/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84a54dbebcc5b71e9bcdd141dae9e95104d98cb1
Gerrit-Change-Number: 19110
Gerrit-PatchSet: 11
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 07 Feb 2023 03:58:01 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11629: Support for huawei OBS FileSystem

2023-02-06 Thread Xiang Yang (Code Review)
Xiang Yang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19110 )

Change subject: IMPALA-11629: Support for huawei OBS FileSystem
..


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19110/10/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/19110/10/tests/common/impala_test_suite.py@1074
PS10, Line 1074: in ['s3', 'isilon', 'local', 'abfs', 'adls', 'gs', 
'cosn', 'ozone', 'obs']:
> I think you want to add obs to this list to address your hbase test failure
Done



--
To view, visit http://gerrit.cloudera.org:8080/19110

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84a54dbebcc5b71e9bcdd141dae9e95104d98cb1
Gerrit-Change-Number: 19110
Gerrit-PatchSet: 11
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 07 Feb 2023 03:49:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11629: Support for huawei OBS FileSystem

2023-02-06 Thread Xiang Yang (Code Review)
Hello Quanlong Huang, Yida Wu, lipeng...@apache.org, Michael Smith, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19110

to look at the new patch set (#11).

Change subject: IMPALA-11629: Support for huawei OBS FileSystem
..

IMPALA-11629: Support for huawei OBS FileSystem

This patch adds support for the Huawei OBS (Object Storage Service)
FileSystem. The implementation is similar to that of other remote
FileSystems.

New flags for OBS:
- num_obs_io_threads: Number of OBS I/O threads. Defaults to 16.

Testing:
 - Upload hdfs test data to an OBS bucket. Modify all locations in HMS
   DB to point to the OBS bucket. Remove some hdfs caching params.
   Run CORE tests.

Change-Id: I84a54dbebcc5b71e9bcdd141dae9e95104d98cb1
---
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/util/hdfs-util.cc
M be/src/util/hdfs-util.h
M bin/impala-config.sh
M fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
M java/executor-deps/pom.xml
M java/pom.xml
M testdata/bin/create-load-data.sh
M testdata/cluster/node_templates/common/etc/hadoop/conf/core-site.xml.py
M tests/common/impala_test_suite.py
M tests/common/skip.py
M tests/custom_cluster/test_metastore_service.py
M tests/util/filesystem_utils.py
15 files changed, 117 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/19110/11
--
To view, visit http://gerrit.cloudera.org:8080/19110

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I84a54dbebcc5b71e9bcdd141dae9e95104d98cb1
Gerrit-Change-Number: 19110
Gerrit-PatchSet: 11
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12319/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19475

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Tue, 07 Feb 2023 02:40:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#4).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
..

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss happens.
Previously, writes to the data cache were synchronous with HDFS file
reads, and both were handled by the remote HDFS IO threads. In other
words, if a cache miss occurred, the IO thread also had to perform the
cache write, which degraded scan performance.
This patch uses a thread pool for asynchronous writes. The number of
threads in the pool is set by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to a temporary buffer when storing data
into the data cache. The additional memory consumed by temporary
buffers is bounded by the new configuration
'data_cache_write_buffer_limit'.
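
The idea can be illustrated with a small Python sketch. This is not the
patch's C++ code: the class AsyncCacheWriter, its method names, and the
choice to drop a write when the buffer limit would be exceeded are all
assumptions made for illustration. Only the two configuration names come
from the commit message, and they map onto the two constructor arguments.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncCacheWriter:
    """Toy model of asynchronous cache writes (hypothetical, not Impala code).

    The caller only copies data into a temporary buffer; pool threads
    perform the actual store later.
    """

    def __init__(self, num_write_threads, write_buffer_limit):
        # num_write_threads plays the role of data_cache_num_write_threads,
        # write_buffer_limit the role of data_cache_write_buffer_limit.
        self._pool = ThreadPoolExecutor(max_workers=num_write_threads)
        self._limit = write_buffer_limit
        self._buffered = 0  # bytes currently held in temporary buffers
        self._lock = threading.Lock()
        self.store = {}  # stands in for the real cache storage

    def submit(self, key, data):
        """Return True if the write was queued, False if it was skipped."""
        size = len(data)
        with self._lock:
            # Assumption: a write that would exceed the limit is skipped.
            if self._buffered + size > self._limit:
                return False
            self._buffered += size
        tmp = bytes(data)  # the only copy made on the caller's thread
        self._pool.submit(self._write, key, tmp)
        return True

    def _write(self, key, tmp):
        # Runs on a pool thread; releases the buffer accounting when done.
        with self._lock:
            self.store[key] = tmp
            self._buffered -= len(tmp)

    def close(self):
        # Wait for all queued writes to finish.
        self._pool.shutdown(wait=True)
```

A caller would construct one writer, call submit() from the read path,
and call close() on shutdown. A request larger than the limit is always
rejected, which keeps the memory held by temporary buffers bounded.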

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different numbers of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit, which tests the limit on
memory consumed by temporary buffers in the case of asynchronous
writes.

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 332 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/4
--
To view, visit http://gerrit.cloudera.org:8080/19475

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19428 )

Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http 
protocol.
..


Patch Set 10: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/9028/


--
To view, visit http://gerrit.cloudera.org:8080/19428

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Gerrit-Change-Number: 19428
Gerrit-PatchSet: 10
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Comment-Date: Tue, 07 Feb 2023 01:33:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11745: Add Hive's ESRI geospatial functions as builtins

2023-02-06 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19425 )

Change subject: IMPALA-11745: Add Hive's ESRI geospatial functions as builtins
..


Patch Set 24: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/19425

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0ca02a70b4ba244778c9db6d14df4423072b225
Gerrit-Change-Number: 19425
Gerrit-PatchSet: 24
Gerrit-Owner: Peter Rozsa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Reviewer: Steve Carlin 
Gerrit-Comment-Date: Tue, 07 Feb 2023 00:39:48 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-06 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 43:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@211
PS42, Line 211: um processing cost that a frag
> Maybe add a TODO to set a range for this parameter to avoid unexpected beha
Ack


http://gerrit.cloudera.org:8080/#/c/19033/43/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/19033/43/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@273
PS43, Line 273: root.isBlockingNode
This seems to be flawed for the case of a streaming preaggregation node. The
implementation of isBlockingNode() in the streaming preaggregation node returns
false. But during execution, when the preaggregation manages to aggregate until
the end (no rows passed through), the CPU costing algorithm should really
consider it a blocking node. Doing so will make the costing algorithm set a
higher instance count for the query fragment right above it (which contains the
final aggregation node).

I confirmed this by setting DISABLE_STREAMING_PREAGGREGATIONS=1 when running
TPCDS Q78 and Q79. With streaming preaggregation disabled, the fragment
containing the final aggregation is scheduled with a higher instance count.


http://gerrit.cloudera.org:8080/#/c/19033/43/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@692
PS43, Line 692: builder.append("work-size=");
  :   builder.append(getTotalWorkSize());
This is now redundant with the fragment's input cardinality, and the sizing is
based on min_processing_cost_per_thread instead of input row count.
This can be removed.



--
To view, visit http://gerrit.cloudera.org:8080/19033

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 43
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Tue, 07 Feb 2023 00:32:29 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11895: Need accessor methods for third party extension

2023-02-06 Thread Kurt Deschler (Code Review)
Kurt Deschler has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19469 )

Change subject: IMPALA-11895: Need accessor methods for third party extension
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/19469

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iaea15ef8a96c9509b9cd79df595868fd1db47e83
Gerrit-Change-Number: 19469
Gerrit-PatchSet: 1
Gerrit-Owner: Steve Carlin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Comment-Date: Tue, 07 Feb 2023 00:32:27 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-10804: [DOCS] Document spill to remote storage

2023-02-06 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19472 )

Change subject: IMPALA-10804: [DOCS] Document spill to remote storage
..


Patch Set 1: Code-Review+1

(2 comments)

It looks good. Just minor things.

http://gerrit.cloudera.org:8080/#/c/19472/1/docs/topics/impala_disk_space.xml
File docs/topics/impala_disk_space.xml:

http://gerrit.cloudera.org:8080/#/c/19472/1/docs/topics/impala_disk_space.xml@422
PS1, Line 422: maximum is 256MB
I think the maximum was updated to 512MB in the code:
https://github.com/apache/impala/blob/40da36414ff4d46b5cdc53f068b1f0a5b28a0f1d/be/src/runtime/tmp-file-mgr.cc#L142.
But the comment has not been updated in
https://github.com/apache/impala/blob/40da36414ff4d46b5cdc53f068b1f0a5b28a0f1d/be/src/runtime/tmp-file-mgr.cc#L101.
I will file another JIRA for changing the comment.


http://gerrit.cloudera.org:8080/#/c/19472/1/docs/topics/impala_disk_space.xml@450
PS1, Line 450: ip_address
nit: we seem to have different spellings for the IP address, ipaddr and
ip_address. Could you please use only one? Either would be fine.



--
To view, visit http://gerrit.cloudera.org:8080/19472

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3efb2ffcc06cdbe69845c6dc4cf03d9f2e3dcabc
Gerrit-Change-Number: 19472
Gerrit-PatchSet: 1
Gerrit-Owner: Shajini Thayasingh 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 06 Feb 2023 23:43:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 43:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@211
PS42, Line 211: um processing cost that a frag
> This is probably subject to tuning. The original intent is to assign "minim
Maybe add a TODO to set a range for this parameter to avoid unexpected behavior.



--
To view, visit http://gerrit.cloudera.org:8080/19033

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 43
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 06 Feb 2023 23:10:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 43:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12318/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19033

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 43
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 06 Feb 2023 23:09:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-06 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 43:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/scheduling/scheduler.h
File be/src/scheduling/scheduler.h:

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/scheduling/scheduler.h@441
PS42, Line 441: TPlanFragment.effective_
> nit: TPlanFragment.effective_instance_count. Otherwise, it's hard to find c
Done


http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@206
PS42, Line 206: 1
> These are relative costs, right? "1" is the minimum cost?
Yes, relative cost. 1 is the minimum.


http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@207
PS42, Line 207: p
> nit: don't break line here
Done


http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@211
PS42, Line 211: um processing cost that a frag
> what's the normal range? why set default value as 100?
This is probably subject to tuning. The original intent is to assign a
"minimum work" per thread, that is, 1 million rows per thread/fragment
instance. This "min_processing_cost_per_thread" scales roughly linearly with
the number of rows per thread, but with the weight factored in as well (C & M
explained in the commit message). I ran tests against tpcds_3000_parquet (10
nodes, mt_dop=12) and this config value seems to work fine for most queries.

Reducing this value will cause the planner to schedule more fragment instances,
while increasing it will reduce the number of fragment instances.
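
As a rough illustration of that trade-off, here is a minimal Python sketch.
The helper instance_count(), the round-up, and the clamp to a maximum are
assumptions made for illustration; the planner's actual adjustment logic is
more involved and is not reproduced here.

```python
import math


def instance_count(processing_cost, min_cost_per_thread, max_instances):
    """Hypothetical helper: one instance per min_cost_per_thread units of
    cost, rounded up and clamped to the range [1, max_instances]."""
    wanted = math.ceil(processing_cost / min_cost_per_thread)
    return max(1, min(wanted, max_instances))


# Same fragment cost, three different threshold settings.
cost = 12_000_000
for threshold in (500_000, 1_000_000, 2_000_000):
    print(threshold, instance_count(cost, threshold, max_instances=64))
```

With a fixed cost of 12 million, halving the threshold from 1,000,000 to
500,000 doubles the instance count from 12 to 24, and doubling it to
2,000,000 halves the count to 6.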


http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift@767
PS42, Line 767:  Allow CPU costing algorithm to schedule fragment instance 
count higher
> The comment is different from the meaning of query option
Fixed.


http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift@771
PS42, Line 771:  q
> Why the max value is 64?
This follows the scale of MT_DOP.


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47
PS42, Line 47: childProcessingCost_
> Could we move this Precondition.checkState to constructor?
I looked at the code again and think that multiple_ is redundant with
numInstanceSupplier_ from the base class.
I refactored the class accordingly and moved the Precondition to
ProcessingCost.setNumInstanceExpected().


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/DataSink.java@68
PS42, Line 68: checkState(proc
> Should we add Preconditions to check if processingCost_ is valid?
Done


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@265
PS42, Line 265: per row.
> per row?
Done


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/util/ExprUtil.java@119
PS42, Line 119: e.getCost() : 1
> What's cost range? Is the value 1 the minimum value?
It is between 1 and 10. IMPALA-2805 added these costs at
https://github.com/apache/impala/blob/40da36414ff4d46b5cdc53f068b1f0a5b28a0f1d/fe/src/main/java/org/apache/impala/analysis/Expr.java#L79-L94



--
To view, visit http://gerrit.cloudera.org:8080/19033

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 43
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 06 Feb 2023 22:56:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-06 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded a new patch set (#43) to the change originally 
created by Qifan Chen. ( http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..

IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by establishing an infrastructure to
allow the weighted total amount of data to process to be used as a new
factor in the definition and selection of an executor group. At the
basis of the CPU costing model, we define ProcessingCost as a cost for a
distinct PlanNode / DataSink / PlanFragment to process its input rows
globally across all of its instances. The costing algorithm then tries
to adjust the number of instances for each fragment by considering their
production-consumption ratio and blocking-operator nature between their
plan nodes, and finally returns a number representing an ideal CPU
core count required for a query to run efficiently. The CPU costing
algorithm is explained in more detail in the four steps below.

I. Compute ProcessingCost for each plan node and data sink.

ProcessingCost of a PlanNode/DataSink is a weighted amount of data
processed by that node/sink. The basic ProcessingCost is computed with a
general formula as follows.

  ProcessingCost is a pair: PC(D, N), where D = I * (C + M)

  where D is the weighted amount of data processed
        I is the input cardinality
        C is the expression evaluation cost per row.
          Set to total weight of expression evaluation in node/sink.
        M is a materialization cost per row.
          Only used by scan and exchange node. Otherwise, 0.
        N is the number of instances.
          Default to D / MIN_COST_PER_THREAD (1 million), but
          is fixed for a certain node/sink and adjustable in step III.
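
The general formula can be written out as a short Python sketch. The class
and field names are hypothetical and do not mirror the Java classes in the
patch, and rounding N up with a floor of 1 is an assumption; the commit
message only gives the ratio D / MIN_COST_PER_THREAD.

```python
import math
from dataclasses import dataclass

MIN_COST_PER_THREAD = 1_000_000  # value quoted in the commit message


@dataclass(frozen=True)
class ProcessingCost:
    cardinality: float                 # I: input cardinality
    expr_cost: float                   # C: expression evaluation cost per row
    materialization_cost: float = 0.0  # M: nonzero only for scan/exchange

    @property
    def total(self):
        # D = I * (C + M)
        return self.cardinality * (self.expr_cost + self.materialization_cost)

    @property
    def default_instances(self):
        # N defaults to D / MIN_COST_PER_THREAD (rounded up here, at least 1).
        return max(1, math.ceil(self.total / MIN_COST_PER_THREAD))


# A node that evaluates three expressions of weight 1 over 50 million rows.
select = ProcessingCost(cardinality=50_000_000, expr_cost=3)
print(select.total, select.default_instances)
```

For the example node, D is 150,000,000 and the default N is 150, before any
of the later adjustment steps are applied.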

In this patch, the weight of each expression evaluation is set to a
constant of 1. A description of the computation for each kind of
PlanNode/DataSink is given below.

01. AggregationNode:
Each AggregateInfo has its C as the sum of its grouping expressions and
aggregate expressions and is then assigned a single ProcessingCost
individually. These ProcessingCosts are then summed to be the
AggregationNode's ProcessingCost;

02. AnalyticEvalNode:
C is the sum of the evaluation costs for analytic functions;

03. CardinalityCheckNode:
Use the general formula, I = 1;

04. DataSourceScanNode:
Follow the formula from the superclass ScanNode;

05. EmptySetNode:
  I = 0;

06. ExchangeNode:
  M = 1 / num rows per batch.

A modification of the general formula when in broadcast mode:
  D = D * number of receivers;

07. HashJoinNode:
  probe cost = PC(I0 * C(equiJoin predicate), N) +
               PC(output cardinality * C(otherJoin predicate), N)
  build cost = PC(I1 * C(equiJoin predicate), N)

With I0 and I1 as the input cardinality of the probe and build side
respectively. If the plan node does not have a separate build,
ProcessingCost is the sum of probe cost and build cost. Otherwise,
ProcessingCost is equal to probe cost.

08. HbaseScanNode:
Follow the formula from the superclass ScanNode;

09. HdfsScanNode and KuduScanNode:
Follow the formula from the superclass ScanNode with modified N.
N is mt_dop when query option mt_dop >= 1, otherwise
N is the number of nodes * max scan threads;

10. Nested loop join node:
When the right child is not a SingularRowSrcNode:

  probe cost = PC(I0 * C(equiJoin predicate), N) +
               PC(output cardinality * C(otherJoin predicate), N)
  build cost = PC(I1 * C(equiJoin predicate), N)

When the right child is a SingularRowSrcNode:

  probe cost = PC(I0, N)
  build cost = PC(I0 * I1, N)

With I0 and I1 as the input cardinality of the probe and build side
respectively. If the plan node does not have a separate build,
ProcessingCost is the sum of probe cost and build cost. Otherwise,
ProcessingCost is equal to probe cost.

11. ScanNode:
  M = average row size / ROWBATCH_MAX_MEM_USAGE (8MB);

12. SelectNode:
Use the general formula;

13. SingularRowSrcNode:
Since the node is involved once per input in nested loop join, the
contribution of this node is computed in nested loop join;

14. SortNode:
C is the evaluation cost for the sort expression;

15. SubplanNode:
C is 1. I is the product of the cardinalities of the left and
the right child;

16. Union node:
C is the cost of result expression evaluation from all non-pass-through
children;

17. Unnest node:
I is the cardinality of the containing SubplanNode and C is 1.

18. DataStreamSink:
  M = 1 / num rows per batch.

19. JoinBuildSink:
ProcessingCost is the build cost of its associated JoinNode.

20. PlanRootSink:
If result spooling is enabled, C is the cost of output expression
evaluation. Otherwise. 

[Impala-ASF-CR] IMPALA-11629: Support for huawei OBS FileSystem

2023-02-06 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19110 )

Change subject: IMPALA-11629: Support for huawei OBS FileSystem
..


Patch Set 10: -Code-Review

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19110/10/tests/common/impala_test_suite.py
File tests/common/impala_test_suite.py:

http://gerrit.cloudera.org:8080/#/c/19110/10/tests/common/impala_test_suite.py@1074
PS10, Line 1074: in ['s3', 'isilon', 'local', 'abfs', 'adls', 'gs', 
'cosn', 'ozone']:
I think you want to add obs to this list to address your hbase test failures.



--
To view, visit http://gerrit.cloudera.org:8080/19110

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84a54dbebcc5b71e9bcdd141dae9e95104d98cb1
Gerrit-Change-Number: 19110
Gerrit-PatchSet: 10
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 06 Feb 2023 21:21:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11900 Test table iceberg partitioned orc has wrong metadata

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19476 )

Change subject: IMPALA-11900 Test table iceberg_partitioned_orc has wrong 
metadata
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19476

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae860f401947092d9fdca802f41dd6de79e0638d
Gerrit-Change-Number: 19476
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 06 Feb 2023 21:16:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19428 )

Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http 
protocol.
..


Patch Set 10:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9028/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19428

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Gerrit-Change-Number: 19428
Gerrit-PatchSet: 10
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Comment-Date: Mon, 06 Feb 2023 20:34:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19428 )

Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http 
protocol.
..


Patch Set 10: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19428

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Gerrit-Change-Number: 19428
Gerrit-PatchSet: 10
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Comment-Date: Mon, 06 Feb 2023 20:34:30 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) IMPALA-11899: Add mandatory links to the Impala Apache home page

2023-02-06 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19477 )

Change subject: IMPALA-11899: Add mandatory links to the Impala Apache home page
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19477/1/index.html
File index.html:

http://gerrit.cloudera.org:8080/#/c/19477/1/index.html@169
PS1, Line 169: 
nit: Indentation looks off now that everything's in the  tag.



--
To view, visit http://gerrit.cloudera.org:8080/19477

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbb491fe2ec3c15305c3c66f1e8857a98fced10f
Gerrit-Change-Number: 19477
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 Feb 2023 19:41:28 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-06 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 42:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/scheduling/scheduler.h
File be/src/scheduling/scheduler.h:

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/scheduling/scheduler.h@441
PS42, Line 441: effective_instance_count
nit: TPlanFragment.effective_instance_count. Otherwise, it's hard to find 
context here.


http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc
File be/src/util/backend-gflag-util.cc:

http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@206
PS42, Line 206: 1
These are relative costs, right? "1" is the minimum cost?


http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@207
PS42, Line 207: "
nit: don't break line here


http://gerrit.cloudera.org:8080/#/c/19033/42/be/src/util/backend-gflag-util.cc@211
PS42, Line 211: min_processing_cost_per_thread
What's the normal range? Why is the default value set to 100?


http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift
File common/thrift/ImpalaService.thrift:

http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift@767
PS42, Line 767:  Control whether to display processing cost detail in query 
plan or not.
The comment does not match the meaning of the query option.


http://gerrit.cloudera.org:8080/#/c/19033/42/common/thrift/ImpalaService.thrift@771
PS42, Line 771: 64
Why is the max value 64?


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
File fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java@47
PS42, Line 47: Preconditions.checkState
Could we move this Preconditions.checkState to the constructor?


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/DataSink.java
File fe/src/main/java/org/apache/impala/planner/DataSink.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/DataSink.java@68
PS42, Line 68: processingCost_
Should we add a Preconditions check to verify that processingCost_ is valid?


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
File fe/src/main/java/org/apache/impala/planner/ExchangeNode.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/planner/ExchangeNode.java@265
PS42, Line 265: per row batch
per row?


http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/util/ExprUtil.java
File fe/src/main/java/org/apache/impala/util/ExprUtil.java:

http://gerrit.cloudera.org:8080/#/c/19033/42/fe/src/main/java/org/apache/impala/util/ExprUtil.java@119
PS42, Line 119: e.getCost() : 1
What's the cost range? Is 1 the minimum value?



--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 42
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Mon, 06 Feb 2023 19:12:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11629: Support for huawei OBS FileSystem

2023-02-06 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19110 )

Change subject: IMPALA-11629: Support for huawei OBS FileSystem
..


Patch Set 10: Code-Review+1

> Patch Set 10:
>
> (3 comments)
>
> Thanks Michael!
> I ran the core tests based on patch 10 with only the specified table formats:
>
> export BE_TEST=false
> export FE_TEST=false
> export JDBC_TEST=false
> export RUN_TESTS_ARGS="--table_formats=parquet/none,kudu/none"
> time bin/run-all-tests.sh -e core
>
>
> the hdfs ee test log: 
> https://issues.apache.org/jira/secure/attachment/13055151/hdfs_ee_test_patch10.log
> the obs ee test log: 
> https://issues.apache.org/jira/secure/attachment/13055152/obs_ee_test_patch10.log
>
> When testing the HDFS filesystem, the HBase services are stopped.
> When testing the OBS filesystem, the HBase and HDFS services are stopped.
> Most of the failed cases failed because of HBase tables.
> Some of the failed OBS cases succeed when run manually with
> './bin/impala-py.test ...'.

The changes all look right to me now. I'd have to spend some time sifting 
through the test results; there are some oddities, such as test_hbase_inserts 
failing with OBS but being skipped with HDFS.


--
To view, visit http://gerrit.cloudera.org:8080/19110
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I84a54dbebcc5b71e9bcdd141dae9e95104d98cb1
Gerrit-Change-Number: 19110
Gerrit-PatchSet: 10
Gerrit-Owner: Xiang Yang 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Xiang Yang 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 06 Feb 2023 17:15:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

2023-02-06 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19428 )

Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http 
protocol.
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Gerrit-Change-Number: 19428
Gerrit-PatchSet: 9
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Comment-Date: Mon, 06 Feb 2023 17:11:27 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) IMPALA-11899: Add mandatory links to the Impala Apache home page

2023-02-06 Thread Jim Apple (Code Review)
Jim Apple has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19477 )

Change subject: IMPALA-11899: Add mandatory links to the Impala Apache home page
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbb491fe2ec3c15305c3c66f1e8857a98fced10f
Gerrit-Change-Number: 19477
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jim Apple 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 06 Feb 2023 17:10:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19428 )

Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http 
protocol.
..


Patch Set 9:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12317/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Gerrit-Change-Number: 19428
Gerrit-PatchSet: 9
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Comment-Date: Mon, 06 Feb 2023 16:56:39 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) IMPALA-11899: Add mandatory links to the Impala Apache home page

2023-02-06 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19477 )

Change subject: IMPALA-11899: Add mandatory links to the Impala Apache home page
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/19477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibbb491fe2ec3c15305c3c66f1e8857a98fced10f
Gerrit-Change-Number: 19477
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 16:53:45 +
Gerrit-HasComments: No


[Impala-ASF-CR](asf-site) IMPALA-11899: Add mandatory links to the Impala Apache home page

2023-02-06 Thread Laszlo Gaal (Code Review)
Laszlo Gaal has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19477 )


Change subject: IMPALA-11899: Add mandatory links to the Impala Apache home page
..

IMPALA-11899: Add mandatory links to the Impala Apache home page

The Apache Foundation has a number of requirements for TLP websites,
which are checked regularly, with reports being published on
whimsy.apache.org.

Impala failed in two categories: missing a copyright notice, and missing
a link to the Apache Privacy Policy.

This change to the Impala home page adds both.

It also fixes a misindented source line, and wraps the whole footer into
the  tag originally reserved for the events box.

Tested by loading the local copy of the page into Chrome and Impala on
my system.

Change-Id: Ibbb491fe2ec3c15305c3c66f1e8857a98fced10f
---
M index.html
1 file changed, 7 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/77/19477/1
-- 
To view, visit http://gerrit.cloudera.org:8080/19477
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: asf-site
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibbb491fe2ec3c15305c3c66f1e8857a98fced10f
Gerrit-Change-Number: 19477
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 


[Impala-ASF-CR] IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

2023-02-06 Thread Jason Fehr (Code Review)
Jason Fehr has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/19428 )

Change subject: IMPALA-11850 Adds HTTP tracing headers when using the hs2-http 
protocol.
..

IMPALA-11850 Adds HTTP tracing headers when using the hs2-http protocol.

When using the hs2 protocol with the http transport, include several
tracing http headers by default.  These headers are:

  * X-Request-Id        -- client defined string that identifies the
                           http request, this string is meaningful only
                           to the client
  * X-Impala-Session-Id -- session id generated by the Impala backend,
                           will be omitted on http calls that occur
                           before this id has been generated
  * X-Impala-Query-Id   -- query id generated by the Impala backend,
                           will be omitted on http calls that occur
                           before this id has been generated

The Impala shell includes these headers by default.  The command
line argument --no_http_tracing has been added to remove these
headers.

The Impala backend logs these headers if they are present on the http
request.  The log messages are written at log level 2 (RPC).
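
The header rules described above can be sketched roughly as follows. This
is an illustration only, not the code from this change: the helper name
build_tracing_headers, its parameters, and the use of uuid4 for the request
id are all assumptions. Only the three header names and the existence of
--no_http_tracing come from the commit message.

```python
import uuid


def build_tracing_headers(session_id=None, query_id=None, enabled=True):
    """Return tracing headers for one http request (hypothetical helper)."""
    if not enabled:
        # Stands in for the effect of --no_http_tracing: no tracing headers.
        return {}
    # The request id is defined by the client; uuid4 is an assumption here.
    headers = {"X-Request-Id": str(uuid.uuid4())}
    # Session and query ids come from the backend, so they are omitted on
    # calls made before the backend has generated them.
    if session_id is not None:
        headers["X-Impala-Session-Id"] = session_id
    if query_id is not None:
        headers["X-Impala-Query-Id"] = query_id
    return headers
```

In this sketch the first call of a session would carry only X-Request-Id,
and later calls would add the other two headers once their ids are known.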

Testing:
  - manual testing (verified using debugging proxy and impala logs)
  - new python test

Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
---
M be/src/transport/THttpServer.cpp
M be/src/transport/THttpServer.h
M shell/ImpalaHttpClient.py
M shell/impala_client.py
M shell/impala_shell.py
M shell/impala_shell_config_defaults.py
M shell/option_parser.py
M tests/common/test_dimensions.py
M tests/shell/test_shell_commandline.py
9 files changed, 267 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/28/19428/9
--
To view, visit http://gerrit.cloudera.org:8080/19428
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7857eb5ec03eba32e06ec8d4133480f2e958ad2f
Gerrit-Change-Number: 19428
Gerrit-PatchSet: 9
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 


[Impala-ASF-CR] IMPALA-11900 Test table iceberg_partitioned_orc has wrong metadata

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19476 )

Change subject: IMPALA-11900 Test table iceberg_partitioned_orc has wrong 
metadata
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12316/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19476
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae860f401947092d9fdca802f41dd6de79e0638d
Gerrit-Change-Number: 19476
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 06 Feb 2023 16:18:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11900 Test table iceberg_partitioned_orc has wrong metadata

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19476 )

Change subject: IMPALA-11900 Test table iceberg_partitioned_orc has wrong 
metadata
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9027/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19476
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iae860f401947092d9fdca802f41dd6de79e0638d
Gerrit-Change-Number: 19476
Gerrit-PatchSet: 2
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Gergely Fürnstáhl 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tamas Mate 
Gerrit-Comment-Date: Mon, 06 Feb 2023 15:59:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11900 Test table iceberg_partitioned_orc has wrong metadata

2023-02-06 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19476 )


Change subject: IMPALA-11900 Test table iceberg_partitioned_orc has wrong 
metadata
..

IMPALA-11900 Test table iceberg_partitioned_orc has wrong metadata

Iceberg table iceberg_partitioned_orc has wrong metadata.
The field 'file_size_in_bytes' is wrong for the data files.

This causes issues on object stores where we rely more on information
coming from Iceberg metadata.

This commit updates the manifest and manifest list files to reflect
correct information.
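
The kind of inconsistency described above could be detected with a check
along these lines. This is a hedged sketch, not part of the change: the
function find_size_mismatches is hypothetical, and it assumes the
(path, file_size_in_bytes) pairs have already been extracted from the
manifest; reading the Avro files themselves is out of scope here.

```python
import os


def find_size_mismatches(entries, size_of=os.path.getsize):
    """Compare recorded sizes against actual sizes (hypothetical helper).

    entries: iterable of (file_path, file_size_in_bytes) pairs taken from
             the table metadata.
    size_of: callable returning the real size of a file; defaults to the
             local filesystem, but can be swapped for an object store lookup.
    """
    mismatches = []
    for path, recorded_size in entries:
        actual_size = size_of(path)
        if actual_size != recorded_size:
            mismatches.append((path, recorded_size, actual_size))
    return mismatches
```

An empty result would mean the metadata agrees with the files for every
entry that was checked.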

Change-Id: Iae860f401947092d9fdca802f41dd6de79e0638d
---
M 
testdata/data/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/02fb8198-e791-4d89-8afa-c75fb5443346-m0.avro
M 
testdata/data/iceberg_test/hadoop_catalog/iceberg_partitioned_orc/functional_parquet/iceberg_partitioned_orc/metadata/snap-3506237933060603670-1-02fb8198-e791-4d89-8afa-c75fb5443346.avro
2 files changed, 0 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/76/19476/1
--
To view, visit http://gerrit.cloudera.org:8080/19476
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iae860f401947092d9fdca802f41dd6de79e0638d
Gerrit-Change-Number: 19476
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11845: (Addendum) Don't specify db name in the new struct tests

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19470 )

Change subject: IMPALA-11845: (Addendum) Don't specify db name in the new 
struct tests
..


Patch Set 2: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8efea5cc2e10d8ae50ee6c1201e325932cb27fbf
Gerrit-Change-Number: 19470
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 14:47:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: (Addendum) Don't specify db name in the new struct tests

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19470 )

Change subject: IMPALA-11845: (Addendum) Don't specify db name in the new 
struct tests
..

IMPALA-11845: (Addendum) Don't specify db name in the new struct tests

Some new tests were added for STAR expansion on struct types when the
table is masked by Ranger masking policies. They are run on both
Parquet and ORC tables. However, some tests explicitly use
'functional_parquet' as the db name, which loses coverage of ORC
tables. This patch removes the explicit db names.

Change-Id: I8efea5cc2e10d8ae50ee6c1201e325932cb27fbf
Reviewed-on: http://gerrit.cloudera.org:8080/19470
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M 
testdata/workloads/functional-query/queries/QueryTest/ranger_column_masking_complex_types.test
1 file changed, 4 insertions(+), 4 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8efea5cc2e10d8ae50ee6c1201e325932cb27fbf
Gerrit-Change-Number: 19470
Gerrit-PatchSet: 3
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11081: Fix incorrect results in partition key scan

2023-02-06 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19471 )

Change subject: IMPALA-11081: Fix incorrect results in partition key scan
..


Patch Set 3:

(1 comment)

Thanks for contributing the fix!

http://gerrit.cloudera.org:8080/#/c/19471/3/tests/query_test/test_queries.py
File tests/query_test/test_queries.py:

http://gerrit.cloudera.org:8080/#/c/19471/3/tests/query_test/test_queries.py@335
PS3, Line 335:   file that contains multiple blocks"""
Can we also test other file formats that require reading file headers, e.g. 
avro?



--
To view, visit http://gerrit.cloudera.org:8080/19471
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17331ed6c26a747e0509dcbaf427cd52808943b1
Gerrit-Change-Number: 19471
Gerrit-PatchSet: 3
Gerrit-Owner: Yifan Zhang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 06 Feb 2023 13:17:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12315/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 12:42:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#3).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
..

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss occurs.
Previously, writes to the data cache were synchronous with hdfs file
reads, and both were handled by remote hdfs IO threads. In other words,
if a cache miss occurred, the IO thread had to take on the additional
work of the cache write, which degraded scan performance.
This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to a temporary buffer when storing data
into the data cache. The additional memory consumed by temporary
buffers can be limited through the new configuration
'data_cache_write_buffer_limit'.
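
The design described above can be sketched roughly as follows. This is a
simplified illustration in Python, not Impala's C++ implementation: the
class AsyncCacheWriter and its parameter names are hypothetical, and
skipping a write when the buffer limit would be exceeded is an assumption
of this sketch rather than something the commit message states.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class AsyncCacheWriter:
    """Hypothetical sketch of asynchronous cache writes with a buffer cap."""

    def __init__(self, store_fn, num_write_threads, write_buffer_limit):
        self._store_fn = store_fn            # performs the actual cache write
        self._limit = write_buffer_limit     # cap on bytes held in buffers
        self._in_flight = 0                  # bytes currently buffered
        self._lock = threading.Lock()
        self._pool = ThreadPoolExecutor(max_workers=num_write_threads)

    def store(self, key, data):
        """Called by the IO thread. Returns False if the write was skipped."""
        size = len(data)
        with self._lock:
            if self._in_flight + size > self._limit:
                return False                 # assumption: skip, do not block
            self._in_flight += size
        # bytes() takes an immutable snapshot (a copy when data is a mutable
        # buffer), so the IO thread can reuse its own buffer right away.
        buf = bytes(data)
        self._pool.submit(self._write, key, buf)
        return True

    def _write(self, key, buf):
        try:
            self._store_fn(key, buf)
        finally:
            with self._lock:
                self._in_flight -= len(buf)  # release the buffer budget

    def close(self):
        self._pool.shutdown(wait=True)       # wait for pending writes
```

The IO thread pays only for the copy and the submit call; the write itself
runs on a pool thread.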

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different numbers of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit on memory consumed by temporary buffers in the
case of asynchronous writes.

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 330 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/3
--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19475

to look at the new patch set (#2).

Change subject: IMPALA-11886: Data cache should support asynchronous writes
..

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve
scan performance when a cache miss occurs.

Previously, writes to the data cache were synchronous with hdfs file
reads, and both were handled by remote hdfs IO threads. In other words,
if a cache miss occurred, the IO thread had to take on the additional
work of the cache write, which degraded scan performance.

This patch uses a thread pool for asynchronous writes, and the number of
threads in the pool is determined by the new configuration
'data_cache_num_write_threads'. In asynchronous write mode, the IO
thread only needs to copy data to a temporary buffer when storing data
into the data cache. The additional memory consumed by temporary
buffers can be limited through the new configuration
'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original
DataCacheTest using different numbers of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit on memory consumed by temporary buffers in the
case of asynchronous writes.

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 330 insertions(+), 68 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/2
--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Anonymous Coward (Code Review)
18770832...@163.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19475 )


Change subject: IMPALA-11886: Data cache should support asynchronous writes
..

IMPALA-11886: Data cache should support asynchronous writes

This patch implements asynchronous writes to the data cache to improve scan 
performance when a cache miss occurs.
Previously, writes to the data cache were synchronous with hdfs file reads, and 
both were handled by remote hdfs IO threads. In other words, if a cache miss 
occurred, the IO thread had to take on the additional work of the cache 
write, which degraded scan performance.
This patch uses a thread pool for asynchronous writes, and the number of 
threads in the pool is determined by the new configuration 
'data_cache_num_write_threads'. In asynchronous write mode, the IO thread only 
needs to copy data to a temporary buffer when storing data into the data 
cache. The additional memory consumed by temporary buffers can be limited 
through the new configuration 'data_cache_write_buffer_limit'.

Testing:
- Add test cases for asynchronous data writing to the original DataCacheTest 
using different numbers of threads.
- Add DataCacheTest,#OutOfWriteBufferLimit
Used to test the limit on memory consumed by temporary buffers in the case of 
asynchronous writes.

Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
---
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/util/impalad-metrics.cc
M be/src/util/impalad-metrics.h
M common/thrift/metrics.json
6 files changed, 330 insertions(+), 68 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/75/19475/1
--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18770832...@163.com>


[Impala-ASF-CR] IMPALA-11886: Data cache should support asynchronous writes

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19475 )

Change subject: IMPALA-11886: Data cache should support asynchronous writes
..


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache-test.cc
File be/src/runtime/io/data-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache-test.cc@104
PS1, Line 104:   /// before any lookup when running test case in async write 
mode.
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.h@499
PS1, Line 499:   /// Limit of the total buffer size used by asynchronous store 
tasks, when the current
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@117
PS1, Line 117: DEFINE_int32(data_cache_num_write_threads, 0,
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@122
PS1, Line 122: DEFINE_string(data_cache_write_buffer_limit, "1GB",
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/runtime/io/data-cache.cc@895
PS1, Line 895: int64_t buffer_limit = 
ParseUtil::ParseMemSpec(FLAGS_data_cache_write_buffer_limit,
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc
File be/src/util/impalad-metrics.cc:

http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@84
PS1, Line 84: const char* 
ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_ACTIVE_BUFFER_BYTES =
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@86
PS1, Line 86: const char* 
ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_STORE_TASKS_CREATED =
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19475/1/be/src/util/impalad-metrics.cc@88
PS1, Line 88: const char* 
ImpaladMetricKeys::IO_MGR_REMOTE_DATA_CACHE_OUT_OF_BUFFER_LIMIT_BYTES =
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/19475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I878f7486d485b6288de1a9145f49576b7155d312
Gerrit-Change-Number: 19475
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 12:21:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11845: (Addendum) Don't specify db name in the new struct tests

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19470 )

Change subject: IMPALA-11845: (Addendum) Don't specify db name in the new 
struct tests
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8efea5cc2e10d8ae50ee6c1201e325932cb27fbf
Gerrit-Change-Number: 19470
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 09:38:25 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: (Addendum) Don't specify db name in the new struct tests

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19470 )

Change subject: IMPALA-11845: (Addendum) Don't specify db name in the new 
struct tests
..


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9026/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8efea5cc2e10d8ae50ee6c1201e325932cb27fbf
Gerrit-Change-Number: 19470
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 09:38:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11845: (Addendum) Don't specify db name in the new struct tests

2023-02-06 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19470 )

Change subject: IMPALA-11845: (Addendum) Don't specify db name in the new 
struct tests
..


Patch Set 1: Code-Review+2

Thanks for fixing this.


--
To view, visit http://gerrit.cloudera.org:8080/19470
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8efea5cc2e10d8ae50ee6c1201e325932cb27fbf
Gerrit-Change-Number: 19470
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 06 Feb 2023 09:38:04 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11803: Impalad hit DCHECK when running union on empty table with MT_DOP>1

2023-02-06 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19474 )

Change subject: IMPALA-11803: Impalad hit DCHECK when running union on empty
table with MT_DOP>1
..


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19474/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19474/2//COMMIT_MSG@7
PS2, Line 7: Impalad hit DCHECK when running union on
   :  empty table with MT_DOP>1
nit: the commit title should summarize how the issue is fixed rather than
describe the issue.


http://gerrit.cloudera.org:8080/#/c/19474/2//COMMIT_MSG@11
PS2, Line 11:
nit: no space at line start


http://gerrit.cloudera.org:8080/#/c/19474/2/tests/custom_cluster/test_mt_dop.py
File tests/custom_cluster/test_mt_dop.py:

http://gerrit.cloudera.org:8080/#/c/19474/2/tests/custom_cluster/test_mt_dop.py@68
PS2, Line 68:   def test_mt_dop_union_empty_table(self, unique_database):
Please verify that the test fails without the fix. I think it won't fail, since
the following annotation is missing here:

 @CustomClusterTestSuite.with_args(cluster_size=1)


http://gerrit.cloudera.org:8080/#/c/19474/2/tests/custom_cluster/test_mt_dop.py@73
PS2, Line 73: self.client.execute("create table {0}.tbl (id int) stored as parquet"
: .format(unique_database))
: self.client.execute("insert into {0}.tbl values (0), (1), (2)"
: .format(unique_database))
We can use an existing table (e.g. functional.alltypestiny) instead of creating 
a new one.
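
A minimal sketch of how the two suggestions above might look together. It is
not the code from the patch: CustomClusterTestSuite and the client below are
small stand-ins (assumptions) so the snippet runs outside the Impala test
tree, the cluster_args attribute is hypothetical, and the query is only a
placeholder for whatever statement reproduces the DCHECK.

```python
class FakeClient(object):
    """Stand-in for the Impala test client; it only records the SQL it is given."""

    def __init__(self):
        self.statements = []

    def execute(self, sql):
        self.statements.append(sql)
        return []


class CustomClusterTestSuite(object):
    """Stand-in for Impala's class of the same name (assumption, not the real one)."""

    @staticmethod
    def with_args(**kwargs):
        # The real decorator configures the custom cluster for the test. Here the
        # arguments are just stored on the function so they can be inspected.
        def decorate(func):
            func.cluster_args = kwargs  # hypothetical attribute name
            return func
        return decorate


class TestMtDop(CustomClusterTestSuite):
    def __init__(self):
        self.client = FakeClient()

    # The annotation the review comment says is missing.
    @CustomClusterTestSuite.with_args(cluster_size=1)
    def test_mt_dop_union_empty_table(self):
        # Reuse an existing table instead of creating and populating a new one.
        # Placeholder query: the actual repro query is not shown in this thread.
        self.client.execute("select count(*) from functional.alltypestiny")
```

With the real suite in place of the stand-ins, the decorator is what would make
the test start its own single-node cluster, and no unique_database fixture
would be needed since nothing is created.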



--
To view, visit http://gerrit.cloudera.org:8080/19474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbae5e1a78211327a214b2d936743bda767ae3c4
Gerrit-Change-Number: 19474
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 06 Feb 2023 09:04:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11803: Impalad hit DCHECK when running union on empty table with MT_DOP>1

2023-02-06 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19474 )

Change subject: IMPALA-11803: Impalad hit DCHECK when running union on empty
table with MT_DOP>1
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12314/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idbae5e1a78211327a214b2d936743bda767ae3c4
Gerrit-Change-Number: 19474
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Mon, 06 Feb 2023 08:29:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11803: Impalad hit DCHECK when running union on empty table with MT_DOP>1

2023-02-06 Thread Anonymous Coward (Code Review)
pranav.lo...@cloudera.com has removed Joe McDonnell from this change.  ( 
http://gerrit.cloudera.org:8080/19474 )

Change subject: IMPALA-11803: Impalad hit DCHECK when running union on empty
table with MT_DOP>1
..


Removed reviewer Joe McDonnell.
--
To view, visit http://gerrit.cloudera.org:8080/19474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Idbae5e1a78211327a214b2d936743bda767ae3c4
Gerrit-Change-Number: 19474
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11803: Impalad hit DCHECK when running union on empty table with MT_DOP>1

2023-02-06 Thread Anonymous Coward (Code Review)
pranav.lo...@cloudera.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19474 )


Change subject: IMPALA-11803: Impalad hit DCHECK when running union on empty
table with MT_DOP>1
..

IMPALA-11803: Impalad hit DCHECK when running union on
 empty table with MT_DOP>1

The DCHECK was hit because the value of useMtScanNode_
 was not being set for empty tables. The fix is also
 verified with an end-to-end test.

Change-Id: Idbae5e1a78211327a214b2d936743bda767ae3c4
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M tests/custom_cluster/test_mt_dop.py
2 files changed, 14 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/19474/2
--
To view, visit http://gerrit.cloudera.org:8080/19474
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idbae5e1a78211327a214b2d936743bda767ae3c4
Gerrit-Change-Number: 19474
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Quanlong Huang