[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 7
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 06:13:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..

IMPALA-11945: Fix Flaky Test in JwtHttpTest

The JwtHttpTest.testJwtAuthWithUntrustedJwksHttpsUrl test has been
flaky in the impala-asf-master-core Jenkins build. This build was
executed twice with this test passing in one run and failing in the
other.

This test and another test named
testJwtAuthWithTrustedJwksHttpsUrlInvalidCN both follow the same
pattern. They attempt to start up a cluster with incorrectly
configured Impala daemons. Then, the Impala coordinator's log file
is searched to ensure the coordinator failed to start up for the
expected reason.  Both these tests were introduced by IMPALA-11922.

The theory for the test flakiness is that the Impala coordinator logs
did not finish writing to disk before they were checked for the
expected startup failure message. The evidence backing this theory
is that all expected log messages were present in the log file except
for the final log line containing the error message.

Three changes were made to fix the flakiness
1. The tests where log files are search for startup failure messages
   now use a single coordinator. Only a single coordinator is needed
   thus this change eliminates potential issues caused by multiple
   coordinators.
2. The tests sleep 5 seconds before searching the log files to give
   time for the logs to be fully written to disk.
3. Log buffering in the Impala daemons was turned off by setting the
   logbuflevel startup flag to -1 which turns off in-memory log
   buffering and writes the logs directly to disk.

Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Reviewed-on: http://gerrit.cloudera.org:8080/19536
Reviewed-by: Impala Public Jenkins 
Tested-by: Impala Public Jenkins 
---
M fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
1 file changed, 65 insertions(+), 18 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 8
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-11822: Optimize the Refresh/Invalidate event processing by skipping unnecessary events

2023-02-24 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19484 )

Change subject: IMPALA-11822: Optimize the Refresh/Invalidate event processing 
by skipping unnecessary events
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@300
PS2, Line 300: HashMap
I think we need ConcurrentHashMap since this will be updated in multi-threads.


http://gerrit.cloudera.org:8080/#/c/19484/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java
File fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java:

http://gerrit.cloudera.org:8080/#/c/19484/1/fe/src/main/java/org/apache/impala/catalog/events/MetastoreEvents.java@2516
PS1, Line 2516:
> Yeah, the event time is generated by HMS. This variable stores the latest t
Sorry, what I mean is clock skew between nodes. It's possible that the HMS and 
catalogd nodes have several milliseconds difference. If any of them have 
connection issues with the ntp server, the time difference could be even larger.


http://gerrit.cloudera.org:8080/#/c/19484/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/19484/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6362
PS1, Line 6362:   }
> Yeah, I did cover the normal refresh cases as well. Did I miss anything?
Oh, I just realized the map is updated in fireReloadEventHelper. Can we update 
the method name to reflect this, e.g. fireReloadEventAndUpdateRefreshMap?


http://gerrit.cloudera.org:8080/#/c/19484/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6423
PS1, Line 6423:   //Update the entries in the refresh metadata map
nit: need space after '//'


http://gerrit.cloudera.org:8080/#/c/19484/1/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6434
PS1, Line 6434: //update all the partitions values
  :   for(String partitionName:
nit: need spaces after '//', 'for' and around ':'


http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java
File fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java:

http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@4513
PS2, Line 4513:   catalog_.setRefreshMetadataMap(refreshMetadataMap);
It seems we don't need to set the map back. It's updated in place.


http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6398
PS2, Line 6398:   catalog_.setRefreshMetadataMap(refreshMetadataMap);
I think we can simply set an empty map and don't need to get and clear the 
current map. Or if we get and clear the current map, we don't need to set it 
back.


http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6425
PS2, Line 6425:   new HashMap<>(catalog_.getRefreshMetadataMap());
Why do we need to create a new map here? Can we directly modify the map 
corresponding to the table?


http://gerrit.cloudera.org:8080/#/c/19484/2/fe/src/main/java/org/apache/impala/service/CatalogOpExecutor.java@6444
PS2, Line 6444:   catalog_.setRefreshMetadataMap(refreshMetadataMap);
This might have concurrent issue when there are concurrent refreshes on 
different tables.



--
To view, visit http://gerrit.cloudera.org:8080/19484
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I905957683a96c3ea01ab4bf043d6658ce37b7574
Gerrit-Change-Number: 19484
Gerrit-PatchSet: 2
Gerrit-Owner: Sai Hemanth Gantasala 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Sai Hemanth Gantasala 
Gerrit-Comment-Date: Sat, 25 Feb 2023 05:29:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19535 )

Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12445/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 25 Feb 2023 05:14:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Quanlong Huang (Code Review)
Quanlong Huang has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19535 )

Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..


Patch Set 2:

(2 comments)

Thanks for your quick review, Daniel!

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@75
PS1, Line 75: }
> Shouldn't this also be IR_ALWAYS_INLINE? It calls StringFunctions::Substrin
I realized that we don't need this annotation in functions that have registered 
their symbols in FE, i.e. builtin functions. The reason is that 
LlvmCodeGen::InlineConstFnAttrs() is invoked in 
ScalarFnCall::GetCodegendComputeFnImpl() which will recursively invoke the same 
method in children:
https://github.com/apache/impala/blob/f54b3c37577cabad396c3eee7d18a1a6e4eea9f1/be/src/exprs/scalar-fn-call.cc#L335
The calls in children will invoke InlineConstFnAttrs() for themselves.

We just need the IR_ALWAYS_INLINE annotation for functions called by builtin 
functions. If the function itself is a builtin function, it needs the 
annotation as well if it's invoked by other builtin functions.

I also verified that the following query doesn't have performance regression:

 select count(*) from tpch100_parquet.lineitem
 where instr(l_comment, 'egular courts above the', 1, 1) > 0;

while the following query has perf regression:

 select count(*) from tpch100_parquet.lineitem
 where instr(l_comment, 'egular courts above the') > 0;

The first one uses the 4 arguments INSTR function directly so it's ok. The 
second one uses the 2 arguments INSTR function which is a wrapper of the 4 
arguments INSTR function. The callee function is not inlined so causes the 
regression.


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@850
PS1, Line 850:   }
> Shouldn't these Instr overloads, Locate and LocatePos also be IR_ALWAYS_INL
Replied above.



--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Sat, 25 Feb 2023 04:54:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Quanlong Huang (Code Review)
Hello Daniel Becker, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19535

to look at the new patch set (#2).

Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..

IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE

String functions that have both UTF-8 and the traditional ASCII
behaviors have checks for the UTF8_MODE query option. The check is
intended to be replaced with constants during codegen in
LlvmCodeGen::InlineConstFnAttrs().

However, as mentioned in the method comment, InlineConstFnAttrs() only
replaces call instructions inside the current function. To replace the
call to FunctionContextImpl::GetConstFnAttr() inside the callee
functions, we have to inline the callee functions (by annotating them
with IR_ALWAYS_INLINE).

This patch annotates UTF-8 related string functions with
IR_ALWAYS_INLINE to make sure the checks on UTF8_MODE are all replaced
in codegen. Note that builtin functions don't need the annotation if
they are not invoked by other builtin functions, because
InlineConstFnAttrs() will be invoked recursively in the expression tree.
See ScalarFnCall::GetCodegendComputeFnImpl().

Perf tests:
Ran PERF_STRING queries in targeted-perf (scale factor 100) on parquet
format. Saw significant improvements:
+-+--++-++
| Query   | File Format  | Avg(s) | Base Avg(s) | Delta(Avg) |
+-+--++-++
| PERF_STRING-Q7  | parquet/none | 10.80  | 10.93   |   -1.13%   |
| PERF_STRING-Q6  | parquet/none | 12.25  | 12.55   |   -2.37%   |
| PERF_STRING-Q5  | parquet/none | 5.87   | 6.01|   -2.33%   |
| PERF_STRING-Q3  | parquet/none | 5.02   | 5.26|   -4.47%   |
| PERF_STRING-Q2  | parquet/none | 4.49   | 4.73|   -4.88%   |
| PERF_STRING-Q4  | parquet/none | 4.99   | 5.28| I -5.48%   |
| PERF_STRING-Q1  | parquet/none | 4.02   | 4.29| I -6.34%   |
+-+--++-++
| PERF_STRING-Q13 | parquet/none | 9.90   | 11.48   | I -13.72%  |
| PERF_STRING-Q9  | parquet/none | 10.01  | 11.64   | I -13.97%  |
| PERF_STRING-Q11 | parquet/none | 9.83   | 11.56   | I -14.97%  |
| PERF_STRING-Q10 | parquet/none | 6.62   | 8.07| I -17.89%  |
| PERF_STRING-Q12 | parquet/none | 6.63   | 8.25| I -19.61%  |
| PERF_STRING-Q8  | parquet/none | 5.03   | 6.45| I -22.09%  |
+-+--++-++
Note that Q8-Q13 are new queries added by this patch to reveal the
performance difference.

Ran the same queries comparing the branch of reverting the UTF-8 changes
(IMPALA-2019). Found no regressions anymore. Even see improvement in one
query (Q13):
+-+--++-++
| Query   | File Format  | Avg(s) | Base Avg(s) | Delta(Avg) |
+-+--++-++
| PERF_STRING-Q11 | parquet/none | 9.85   | 9.56|   +3.04%   |
| PERF_STRING-Q7  | parquet/none | 10.76  | 10.59   |   +1.57%   |
| PERF_STRING-Q1  | parquet/none | 4.04   | 4.00|   +0.83%   |
| PERF_STRING-Q4  | parquet/none | 5.02   | 5.00|   +0.51%   |
| PERF_STRING-Q2  | parquet/none | 4.53   | 4.51|   +0.32%   |
| PERF_STRING-Q5  | parquet/none | 5.81   | 5.81|   +0.02%   |
| PERF_STRING-Q8  | parquet/none | 5.02   | 5.04|   -0.31%   |
| PERF_STRING-Q12 | parquet/none | 6.65   | 6.68|   -0.43%   |
| PERF_STRING-Q3  | parquet/none | 5.02   | 5.05|   -0.53%   |
| PERF_STRING-Q6  | parquet/none | 12.23  | 12.33   |   -0.84%   |
| PERF_STRING-Q10 | parquet/none | 6.68   | 6.74|   -0.87%   |
| PERF_STRING-Q9  | parquet/none | 10.01  | 10.41   |   -3.86%   |
| PERF_STRING-Q13 | parquet/none | 9.92   | 10.54   | I -5.85%   |
+-+--++-++

Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
---
M be/src/exprs/mask-functions-ir.cc
M be/src/exprs/string-functions-ir.cc
M testdata/workloads/targeted-perf/queries/string.test
3 files changed, 117 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/19535/2
--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate 
selectivities
..


Patch Set 13:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12444/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 
Gerrit-Comment-Date: Sat, 25 Feb 2023 04:22:57 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

2023-02-24 Thread wangsheng (Code Review)
wangsheng has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate 
selectivities
..


Patch Set 13:

(29 comments)

> Can you please provide feedback on all comments?
 >
 > For example, for the  following comment, a feedback can be DONE, a
 > reply etc.
 >
 > Commit Message
 > Line 14:
 > TABLE_NUM_ROWS?
 >
 > The feedbacks are useful to judge the rework. Thanks!

Ok, Qifan, thanks for reply. I've already answer all comments.

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/Predicate.java
File fe/src/main/java/org/apache/impala/analysis/Predicate.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/Predicate.java@78
PS9, Line 78: ti
> nit. should be a double value in (0, 1].
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java
File fe/src/main/java/org/apache/impala/analysis/TableRef.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@547
PS9, Line 547:   return;
> The above hints are all limited to hdfs tables since they are related to hd
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@554
PS9, Line 554: }
> nit. for this HDFS table reference. On paper, one can assign different # ro
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/analysis/TableRef.java@555
PS9, Line 555:
> It seems we need to handle NumberFormatException thrown from this method. F
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@81
PS9, Line 81: H
> nit. "."
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/main/java/org/apache/impala/planner/ScanNode.java@82
PS9, Line 82: ng tableNumRowsHint_ = -1;
:
> I wonder if this is still correct. I thought the hint can be used to overri
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
File fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java:

http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5053
PS9, Line 5053:  void testCreatePartitio
> This error may not be user friendly for super long SQL query.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5056
PS9, Line 5056: "PARTITIONED BY SPEC (BUC
> same comment as above.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5060
PS9, Line 5060:  "STORED AS ICEBERG" + tb
> same comment as above.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5064
PS9, Line 5064:  int,
> nit. may add a qualifier "non-HDFS" before 'table' to make it clear.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5071
PS9, Line 5071: "PARTITIONED BY SPEC (BU
> See the comment on TABLE_NUM_ROWS for "syntax error in line 1".
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5074
PS9, Line 5074: "PARTITIONED BY SPEC (TRUNCATE(0, p1), DAY(p2)) STORED 
AS ICEBERG" +
> Isn't this supported now in patch set 9?
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5085
PS9, Line 5085:  Legal hint return correct
> is this a mistake? 0.1 is a perfect selectivity value.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5087
PS9, Line 5087: zesOk("select * from tp
> Same as above. 0 is okay.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5093
PS9, Line 5093:
> should be [0,1], right?
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5098
PS9, Line 5098: gal
> See the comment about. Should be allowed.
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java@5102
PS9, Line 5102:
  : // Multiple illegal hints wil
> From the updated parser, it seems this hint should be accepted in that the
Done


http://gerrit.cloudera.org:8080/#/c/18023/9/testdata/workloads/functional-planner/queries/PlannerTest/hdfs-cardinality-hint.test
File 

[Impala-ASF-CR] IMPALA-7942 (part 2): Add query hints for predicate selectivities

2023-02-24 Thread wangsheng (Code Review)
wangsheng has uploaded a new patch set (#13). ( 
http://gerrit.cloudera.org:8080/18023 )

Change subject: IMPALA-7942 (part 2): Add query hints for predicate 
selectivities
..

IMPALA-7942 (part 2): Add query hints for predicate selectivities

Currently, Impala only uses simple estimation to compute selectivity
for some predicates, and this may lead to worse query plan due to CBO.
Hence, we add new hints to reduce such errors. Maybe in the future,
we can use histograms to get more precise query plan.

This patch adds another query hints: 'SELECTIVITY', we can use this
hint to original selectivity computing.

Format like this:

  select col from t where (a=1) /* +SELECTIVITY(0.5) */;

Besides, this hint is also valid for compound predicate like this:

  select col from t where (a=1 and b=2) /* +SELECTIVITY(0.5) */;

But pay attention, if we want to use 'SELECTIVITY' hint for predicate,
we need to wrap the predicate by braket, even for single binary
predicate.

Testing:
- Added new fe tests in 'PlannerTest'
- Added new fe tests in 'AnalyzeStmtsTest' for negative cases

Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
---
M fe/src/main/cup/sql-parser.cup
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Predicate.java
M fe/src/main/java/org/apache/impala/rewrite/BetweenToCompoundRule.java
M fe/src/main/jflex/sql-scanner.flex
M fe/src/test/java/org/apache/impala/analysis/AnalyzeStmtsTest.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A 
testdata/workloads/functional-planner/queries/PlannerTest/predicate-selectivity-hint.test
12 files changed, 375 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/23/18023/13
--
To view, visit http://gerrit.cloudera.org:8080/18023
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2776b9bbd878b8a21d9c866b400140a454f59e1b
Gerrit-Change-Number: 18023
Gerrit-PatchSet: 13
Gerrit-Owner: wangsheng 
Gerrit-Reviewer: Amogh Margoor 
Gerrit-Reviewer: Fucun Chu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Reviewer: wangsheng 


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 6:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12443/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 6
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 01:25:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 7
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 01:06:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9072/ 
DRY_RUN=false


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 7
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 01:06:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 6: Code-Review+2

Looks good, thanks!
Carry Andrew's +1.


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 6
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 01:05:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Jason Fehr (Code Review)
Jason Fehr has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..

IMPALA-11945: Fix Flaky Test in JwtHttpTest

The JwtHttpTest.testJwtAuthWithUntrustedJwksHttpsUrl test has been
flaky in the impala-asf-master-core Jenkins build. This build was
executed twice with this test passing in one run and failing in the
other.

This test and another test named
testJwtAuthWithTrustedJwksHttpsUrlInvalidCN both follow the same
pattern. They attempt to start up a cluster with incorrectly
configured Impala daemons. Then, the Impala coordinator's log file
is searched to ensure the coordinator failed to start up for the
expected reason.  Both these tests were introduced by IMPALA-11922.

The theory for the test flakiness is that the Impala coordinator logs
did not finish writing to disk before they were checked for the
expected startup failure message. The evidence backing this theory
is that all expected log messages were present in the log file except
for the final log line containing the error message.

Three changes were made to fix the flakiness
1. The tests where log files are search for startup failure messages
   now use a single coordinator. Only a single coordinator is needed
   thus this change eliminates potential issues caused by multiple
   coordinators.
2. The tests sleep 5 seconds before searching the log files to give
   time for the logs to be fully written to disk.
3. Log buffering in the Impala daemons was turned off by setting the
   logbuflevel startup flag to -1 which turns off in-memory log
   buffering and writes the logs directly to disk.

Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
---
M fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
1 file changed, 65 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/19536/6
--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 6
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19536/5/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
File fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java:

http://gerrit.cloudera.org:8080/#/c/19536/5/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java@443
PS5, Line 443: List logLines = null;
 : Matcher> m = 
hasItem(containsString(expectedErrString));
 :
 : // writing logs to disk may take some time, try a few times 
to search for the
 : // expected error in the log
 : for (int i=0; i<10; i++) {
 :   logLines = 
Files.readAllLines(logDir.resolve("impalad.ERROR"));
 :   if (m.matches(logLines)) {
 : break;
 :   }
 :   Thread.sleep(250);
 : }
 :
 : // runs the matcher one more time to ensure a descriptive 
failure message is
 : // generated if the assert fails
> nit: This code is duplicated in two places. Maybe put this into its own met
Done



--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 5
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 01:04:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP IMPALA-11947: Bump GBN to get Iceberg change #6074

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19538 )

Change subject: WIP IMPALA-11947: Bump GBN to get Iceberg change #6074
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12442/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c7a3564c51dcd5216b811b66ad45f0159fec817
Gerrit-Change-Number: 19538
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:38:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 5:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12441/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 5
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:28:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 5: Code-Review+1

LGTM, I'll let Riza take this to +2


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 5
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:22:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..


Patch Set 7: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:19:16 +
Gerrit-HasComments: No


[Impala-ASF-CR] WIP IMPALA-11947: Bump GBN to get Iceberg change #6074

2023-02-24 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19538


Change subject: WIP IMPALA-11947: Bump GBN to get Iceberg change #6074
..

WIP IMPALA-11947: Bump GBN to get Iceberg change #6074

We need Iceberg change #6074 (Allow using SnapshotManager in Transaction)
to fix IMPALA-11482.

Bump CDP_BUILD_NUMBER (GBN) to 38235009
Bump various CDP version numbers to be based on 7.2.17.0-127
Bump Iceberg version to 1.1.0

TESTING:

I am running exhaustive tests...

Change-Id: I3c7a3564c51dcd5216b811b66ad45f0159fec817
---
M bin/impala-config.sh
1 file changed, 11 insertions(+), 11 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/38/19538/1
--
To view, visit http://gerrit.cloudera.org:8080/19538
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3c7a3564c51dcd5216b811b66ad45f0159fec817
Gerrit-Change-Number: 19538
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 5:

(1 comment)

Thanks Jason, just have 1 more nit.

http://gerrit.cloudera.org:8080/#/c/19536/5/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
File fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java:

http://gerrit.cloudera.org:8080/#/c/19536/5/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java@443
PS5, Line 443: List logLines = null;
 : Matcher> m = 
hasItem(containsString(expectedErrString));
 :
 : // writing logs to disk may take some time, try a few times 
to search for the
 : // expected error in the log
 : for (int i=0; i<10; i++) {
 :   logLines = 
Files.readAllLines(logDir.resolve("impalad.ERROR"));
 :   if (m.matches(logLines)) {
 : break;
 :   }
 :   Thread.sleep(250);
 : }
 :
 : // runs the matcher one more time to ensure a descriptive 
failure message is
 : // generated if the assert fails
nit: This code is duplicated in two places. Maybe put this into its own method?



--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 5
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:16:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
File fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java:

http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java@416
PS4, Line 416: testJwtAuthWithUntrustedJwksHttpsUrl
> Can you try loop this test in your local machine? just to make sure it is n
good idea, I ran the test 100 times and it did not fail once



--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 4
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:09:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Jason Fehr (Code Review)
Jason Fehr has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19536/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19536/4//COMMIT_MSG@9
PS4, Line 9: The JwtHttpTest.testJwtAuthWithUntrustedJwksHttpsUrl
> nit: mention the IMPALA-11922 in commit message.
done


http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
File fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java:

http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java@442
PS4, Line 442: // check in the impalad logs that the server startup failed 
for the expected reason
> Nit: We could avoid the sleep by looping a few times retrying the readAllLi
fixing to eliminate the 5 second sleep and establish a better pattern



--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 5
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Sat, 25 Feb 2023 00:08:48 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Jason Fehr (Code Review)
Jason Fehr has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..

IMPALA-11945: Fix Flaky Test in JwtHttpTest

The JwtHttpTest.testJwtAuthWithUntrustedJwksHttpsUrl test has been
flaky in the impala-asf-master-core Jenkins build. This build was
executed twice with this test passing in one run and failing in the
other.

This test and another test named
testJwtAuthWithTrustedJwksHttpsUrlInvalidCN both follow the same
pattern. They attempt to start up a cluster with incorrectly
configured Impala daemons. Then, the Impala coordinator's log file
is searched to ensure the coordinator failed to start up for the
expected reason.  Both these tests were introduced by IMPALA-11922.

The theory for the test flakiness is that the Impala coordinator logs
did not finish writing to disk before they were checked for the
expected startup failure message. The evidence backing this theory
is that all expected log messages were present in the log file except
for the final log line containing the error message.

Three changes were made to fix the flakiness
1. The tests where log files are search for startup failure messages
   now use a single coordinator. Only a single coordinator is needed
   thus this change eliminates potential issues caused by multiple
   coordinators.
2. The tests sleep 5 seconds before searching the log files to give
   time for the logs to be fully written to disk.
3. Log buffering in the Impala daemons was turned off by setting the
   logbuflevel startup flag to -1 which turns off in-memory log
   buffering and writes the logs directly to disk.

Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
---
M fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
1 file changed, 61 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/19536/5
--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 5
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19536/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19536/4//COMMIT_MSG@9
PS4, Line 9: The JwtHttpTest.testJwtAuthWithUntrustedJwksHttpsUrl
nit: mention the IMPALA-11922 in commit message.
In case we backport IMPALA-11922 somewhere else, we'll remember to include 
IMPALA-11945 as well.


http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
File fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java:

http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java@416
PS4, Line 416: testJwtAuthWithUntrustedJwksHttpsUrl
Can you try loop this test in your local machine? just to make sure it is not 
flaky anymore. Maybe like this from $IMPALA_HOME:

(pushd fe && for i in {1..100}; do mvn -fae test 
-Dtest=JwtHttpTest#testJwtAuthWithUntrustedJwksHttpsUrl; done)



--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 4
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Comment-Date: Fri, 24 Feb 2023 22:27:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 4:

(1 comment)

I have a suggestion which you can push back on if you like

http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
File fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java:

http://gerrit.cloudera.org:8080/#/c/19536/4/fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java@442
PS4, Line 442: Thread.sleep(5000);
Nit: We could avoid the sleep by looping a few times retrying the 
readAllLines+containsString with a short sleep after each try.
Impala has a LOT of tests and we want to try and keep them running as fast as 
possible. So 10 extra seconds of sleep may seem like not much, but it would be 
better to establish a pattern that doesn't involve sleeping in case other 
people copy and paste the test,



--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 4
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 22:13:57 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19537 )

Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12440/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 1
Gerrit-Owner: John Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 22:13:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11946: Add Thrift HTTP support for external frontend

2023-02-24 Thread John Sherman (Code Review)
John Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19537


Change subject: IMPALA-11946: Add Thrift HTTP support for external frontend
..

IMPALA-11946: Add Thrift HTTP support for external frontend

- Add enable_external_fe_http flag that defaults to false
  - When true the external frontend service (external_fe_port) will
  expect clients to use http transport.
  - When false the external frontend service will expect binary
  transport.

- Add tests for basic external frontend functionality
- Add test to ensure the non-external frontend services do not expose
  the ExecutePlannedStatement interface.

Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
---
M be/src/service/impala-server.cc
M fe/src/test/java/org/apache/impala/customcluster/CustomClusterRunner.java
A fe/src/test/java/org/apache/impala/customcluster/ExternalFrontendTest.java
3 files changed, 208 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/37/19537/1
--
To view, visit http://gerrit.cloudera.org:8080/19537
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2ad400b1df471e3d61b62d8c0360b27396c26050
Gerrit-Change-Number: 19537
Gerrit-PatchSet: 1
Gerrit-Owner: John Sherman 


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19536 )

Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12439/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 4
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 21:54:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11945: Fix Flaky Test in JwtHttpTest

2023-02-24 Thread Jason Fehr (Code Review)
Jason Fehr has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19536


Change subject: IMPALA-11945: Fix Flaky Test in JwtHttpTest
..

IMPALA-11945: Fix Flaky Test in JwtHttpTest

The JwtHttpTest.testJwtAuthWithUntrustedJwksHttpsUrl test has been
flaky in the impala-asf-master-core Jenkins build. This build was
executed twice with this test passing in one run and failing in the
other.

This test and another test named
testJwtAuthWithTrustedJwksHttpsUrlInvalidCN both follow the same
pattern. They attempt to start up a cluster with incorrectly
configured Impala daemons. Then, the Impala coordinator's log file
is searched to ensure the coordinator failed to start up for the
expected reason.

The theory for the test flakiness is that the Impala coordinator logs
did not finish writing to disk before they were checked for the
expected startup failure message. The evidence backing this theory
is that all expected log messages were present in the log file except
for the final log line containing the error message.

Three changes were made to fix the flakiness
1. The tests where log files are search for startup failure messages
   now use a single coordinator. Only a single coordinator is needed
   thus this change eliminates potential issues caused by multiple
   coordinators.
2. The tests sleep 5 seconds before searching the log files to give
   time for the logs to be fully written to disk.
3. Log buffering in the Impala daemons was turned off by setting the
   logbuflevel startup flag to -1 which turns off in-memory log
   buffering and writes the logs directly to disk.

Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
---
M fe/src/test/java/org/apache/impala/customcluster/JwtHttpTest.java
1 file changed, 37 insertions(+), 7 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/36/19536/4
--
To view, visit http://gerrit.cloudera.org:8080/19536
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I26d8c2cffa7bc095b81eb572709f55b04ac1fa67
Gerrit-Change-Number: 19536
Gerrit-PatchSet: 4
Gerrit-Owner: Jason Fehr 


[native-toolchain-CR] IMPALA-11944: Add SLES15 support

2023-02-24 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19528 )

Change subject: IMPALA-11944: Add SLES15 support
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19528/3/source/kudu/build.sh
File source/kudu/build.sh:

http://gerrit.cloudera.org:8080/#/c/19528/3/source/kudu/build.sh@125
PS3, Line 125:   # Set JAVA_HOME to the system java. Kudu has a bunch of 
heuristics for trying to
I can remove this with the JDK 8 we ended up with.


http://gerrit.cloudera.org:8080/#/c/19528/3/source/python/build.sh
File source/python/build.sh:

http://gerrit.cloudera.org:8080/#/c/19528/3/source/python/build.sh@58
PS3, Line 58: wrap $LOCAL_INSTALL/bin/python -c 'import bz2; import 
readline; from urllib2 import HTTPSHandler; from httplib import HTTPConnection'
This and the thrift/build.sh could be handled in a different patch.



--
To view, visit http://gerrit.cloudera.org:8080/19528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie27521cdbec2f7e966e70c2b0a954d391bf1500d
Gerrit-Change-Number: 19528
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 
Gerrit-Comment-Date: Fri, 24 Feb 2023 21:32:12 +
Gerrit-HasComments: Yes


[native-toolchain-CR] IMPALA-11944: Add SLES15 support

2023-02-24 Thread Michael Smith (Code Review)
Hello Laszlo Gaal,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19528

to look at the new patch set (#3).

Change subject: IMPALA-11944: Add SLES15 support
..

IMPALA-11944: Add SLES15 support

Adds support for building with SLES 15 SP4.

Adds a container definition for SLES 15 based on public repositories.
SLES 15 SP4 does not provide Python 2 or Java 8 JDK, so we use openSUSE
Leap 15.4 distribution packages for these.

Fixes several invocations of python that unexpectedly used the system
python instead of the python version built in the native-toolchain.

Change-Id: Ie27521cdbec2f7e966e70c2b0a954d391bf1500d
---
M Makefile
A docker/sles15.df
M in-docker.py
M source/kudu/build.sh
M source/python/build.sh
M source/thrift/build.sh
6 files changed, 73 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/28/19528/3
--
To view, visit http://gerrit.cloudera.org:8080/19528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie27521cdbec2f7e966e70c2b0a954d391bf1500d
Gerrit-Change-Number: 19528
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19533 )

Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19533/2/be/src/scheduling/schedule-state.h
File be/src/scheduling/schedule-state.h:

http://gerrit.cloudera.org:8080/#/c/19533/2/be/src/scheduling/schedule-state.h@288
PS2, Line 288: The assumption is that all
 :   /// executors have the same memory limit for admission and all 
the coordinators have the
 :   /// same memory limit for admission.
> Is the assumption working if one node is set as both executor and coordinat
Thanks Wenzhe, that's a good point. I think we could have couple of cases:
- Impala with dedicated coordinators and executors
- Impala with all nodes acting as both coordinators and executors
- Impala with some nodes acting as dedicated coordinators and dedicated 
executors while others acting as both

I think the third case is a bit tricky to handle. Let me think about this a bit 
more and rework the patch.



--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 21:00:17 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 52:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12438/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If32dc770dfffcdd0be2ba789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 52
Gerrit-Owner: Qifan Chen 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Qifan Chen 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 20:57:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..


Patch Set 52:

(25 comments)

ps52 is a rebase and address some comments in ps51.

http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@151
PS51, Line 151:
> nit plan fragment, which is blocking since it has 3 blocking PlanNode:
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@171
PS51, Line 171:
  : In bottom-up direction, there exist four segments in F03:
  :   Blocking segment 1: (11:EXCHANGE, 12:AGGREGATE)
  :   Blocking segment 2: 06
> indentation and with some additional info as follows.
Done. I rather keep calling segment 4 as non-blocking segment since only 
JoinBuildSink is considered as blocking DataSink.

This has correlation with definition of blocking fragment. All fragment has 
DataSink. But fragment without blocking PlanNode nor blocking DataSink is not 
blocking fragment.


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@177
PS51, Line 177:
  : Therefore we have:
  :   PC(segment 1) = 426337+34548320
  :   PC(segment 2) =
> indentation
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@182
PS51, Line 182:   PC(segment 4) = 22
> nit a
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@183
PS51, Line 183:
  : These per-segment costs stored in a SegmentCost tree rooted at
  : PlanFragment.rootSegment_, and ar
> nit. , and are [34974657, 2159270, 23752870, 22] respectively after the pos
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@189
PS51, Line 189: PlanFragment.collectSegmentCostHelper().
> I think "Output ProcessingCost" should be really called "Total Processing c
Removed this paragraph and refer the cost directly as "the last segment's 
ProcessingCost".

I only consider the last segment rather than the total over all segment in 
fragment to anticipate for burst exchange scenario. For example, fragment that 
only do aggregate may spend long time during aggregation. But when it is ready 
to send rows upward, the receiver fragment above it should have similar EDoP to 
keep-up with the sender.


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@196
PS51, Line 196: hat fragment by compa
> nit. effective degree of parallelism (EDoP). We can use EDoP in the rest of
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@199
PS51, Line 199:  UnionNode, or
  : ScanNode will
> the costing algorithm attempts to adjust the number of
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@201
PS51, Line 201:
> see the previous comment.
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@205
PS51, Line 205:
> Assume that segments are modeled as a list in a plan fragment (true?) in th
No, SegmentCost is modelled as tree.


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@209
PS51, Line 209:  in a similar post-order
> EDoP
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@261
PS51, Line 261:
> nit suggest to remove since a query plan with a sink node, which is blockin
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@263
PS51, Line 263: f a query or the query itself. Each blocking
  : subtree will
> the intermediate or leaf nodes.
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@268
PS51, Line 268: ample is [4, 4,
> By reading this para, it seems CoreCount is a better name.  Usually a requi
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@269
PS51, Line 269:
> nit remove
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@286
PS51, Line 286: control the entire com
> EDoP
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@288
PS51, Line 288: m or not.
> nit. remove
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@288
PS51, Line 288:Control whether to enable this CPU costing algorithm or not.
> set
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@292
PS51, Line 292:  instances (threads) that
> the entire computation of EDoP.
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@300
PS51, Line 300: ount of
> computing
Done


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@301
PS51, Line 301:  is suggested to keep this to
  :false until the min_processing_per_thread backend fl
> I strongly suggest that we introduce PROCESSING_COST_MIN_THREADS in this pa
Ack. Will take a look at this.


http://gerrit.cloudera.org:8080/#/c/19033/51//COMMIT_MSG@304
PS51, Line 304:
  : This patch also adds three backend flags to tune the algorithm.
  : 1. query_cpu_count_divisor
  :Divide the CPU requirement of a 

[Impala-ASF-CR] IMPALA-11604 Planner changes for CPU usage

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has uploaded a new patch set (#52) to the change originally 
created by Qifan Chen. ( http://gerrit.cloudera.org:8080/19033 )

Change subject: IMPALA-11604 Planner changes for CPU usage
..

IMPALA-11604 Planner changes for CPU usage

This patch augments IMPALA-10992 by establishing an infrastructure to
allow the weighted total amount of data to process to be used as a new
factor in the definition and selection of an executor group. At the
basis of the CPU costing model, we define ProcessingCost as a cost for a
distinct PlanNode / DataSink / PlanFragment to process its input rows
globally across all of its instances. The costing algorithm then tries
to adjust the number of instances for each fragment by considering their
production-consumption ratio, and then finally returns a number
representing an ideal CPU core count required for a query to run
efficiently. A more detailed explanation of the CPU costing algorithm
can be found in the four steps below.

I. Compute ProcessingCost for each plan node and data sink.

ProcessingCost of a PlanNode/DataSink is a weighted amount of data
processed by that node/sink. The basic ProcessingCost is computed with a
general formula as follows.

  ProcessingCost is a pair: PC(D, N), where D = I * (C + M)

  where D is the weighted amount of data processed
I is the input cardinality
C is the expression evaluation cost per row.
  Set to total weight of expression evaluation in node/sink.
M is a materialization cost per row.
  Only used by scan and exchange node. Otherwise, 0.
N is the number of instances.
  Default to D / MIN_COST_PER_THREAD (1 million), but
  is fixed for a certain node/sink and adjustable in step III.

In this patch, the weight of each expression evaluation is set to a
constant of 1. A description of the computation for each kind of
PlanNode/DataSink is given below.

01. AggregationNode:
Each AggregateInfo has its C as a sum of grouping expression and
aggregate expression and then assigned a single ProcessingCost
individually. These ProcessingCosts then summed to be the Aggregation
node's ProcessingCost;

02. AnalyticEvalNode:
C is the sum of the evaluation costs for analytic functions;

03. CardinalityCheckNode:
Use the general formula, I = 1;

04. DataSourceScanNode:
Follow the formula from the superclass ScanNode;

05. EmptySetNode:
  I = 0;

06. ExchangeNode:
  M = (average serialized row size) / 1024

A modification of the general formula when in broadcast mode:
  D = D * number of receivers;

07. HashJoinNode:
  probe cost = PC(I0 * C(equiJoin predicate),  N)  +
   PC(output cardinality * C(otherJoin predicate), N)
  build cost = PC(I1 * C(equi-join predicate), N)

With I0 and I1 as input cardinality of the probe and build side
accordingly. If the plan node does not have a separate build, ProcessingCost
is the sum of probe cost and build cost. Otherwise, ProcessingCost is
equal to probeCost.

08. HbaseScanNode, HdfsScanNode, and KuduScanNode:
Follow the formula from the superclass ScanNode;

09. Nested loop join node:
When the right child is not a SingularRowSrcNode:

  probe cost = PC(I0 * C(equiJoin predicate), N)  +
   PC(output cardinality * C(otherJoin predicate), N)
  build cost = PC(I1 * C(equiJoin predicate), N)

When the right child is a SingularRowSrcNode:

  probe cost = PC(I0, N)
  build cost = PC(I0 * I1, N)

With I0 and I1 as input cardinality of the probe and build side
accordingly. If the plan node does not have a separate build, ProcessingCost
is the sum of probe cost and build cost. Otherwise, ProcessingCost is
equal to probeCost.

10. ScanNode:
  M = (average row size) / 1024;

11. SelectNode:
Use the general formula;

12. SingularRowSrcNode:
Since the node is involved once per input in nested loop join, the
contribution of this node is computed in nested loop join;

13. SortNode:
C is the evaluation cost for the sort expression;

14. SubplanNode:
C is 1. I is the multiplication of the cardinality of the left and
the right child;

15. Union node:
C is the cost of result expression evaluation from all non-pass-through
children;

16. Unnest node:
I is the cardinality of the containing SubplanNode and C is 1.

17. DataStreamSink:
  M = 1 / num rows per batch.

18. JoinBuildSink:
ProcessingCost is the build cost of its associated JoinNode.

29. PlanRootSink:
If result spooling is enabled, C is the cost of output expression
evaluation. Otherwise. ProcessingCost is zero.

20. TableSink:
C is the cost of output expression evaluation.
TableSink subclasses (including HBaseTableSink, HdfsTableSink, and
KuduTableSink) follows the same formula;

II. Compute the total ProcessingCost of a 

[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 20:09:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..


Patch Set 7:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12437/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 19:26:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Wenzhe Zhou (Code Review)
Wenzhe Zhou has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19533 )

Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19533/2/be/src/scheduling/admission-controller-test.cc
File be/src/scheduling/admission-controller-test.cc:

http://gerrit.cloudera.org:8080/#/c/19533/2/be/src/scheduling/admission-controller-test.cc@877
PS2, Line 877: set_is_executor(false);
If the node is set as executor and coordinator, the assumption mat not work.


http://gerrit.cloudera.org:8080/#/c/19533/2/be/src/scheduling/schedule-state.h
File be/src/scheduling/schedule-state.h:

http://gerrit.cloudera.org:8080/#/c/19533/2/be/src/scheduling/schedule-state.h@288
PS2, Line 288: The assumption is that all
 :   /// executors have the same memory limit for admission and all 
the coordinators have the
 :   /// same memory limit for admission.
Is the assumption working if one node is set as both executor and coordinator?



--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 19:16:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..


Patch Set 7:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/9071/ 
DRY_RUN=true


--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 19:11:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Michael Smith (Code Review)
Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..


Patch Set 7: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 19:08:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Joe McDonnell (Code Review)
Joe McDonnell has uploaded a new patch set (#7) to the change originally 
created by Csaba Ringhofer. ( http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..

IMPALA-11624: Bump Impyla dependency to 0.18.0

IMPALA_THRIFT_PY_VERSION is also bumped to 0.16.0p3.
As 0.16.0p3 Thrift does not contain Python related
patches and Impyla 0.18.0 depends on Thrift 0.16.0,
now we are consistently using Thrift 0.16.0 in all
Python code. This also bumps the Thrift in the
shell's ext-py directory to 0.16.0 (based on the
Thrift 0.16.0 pypi tarball with the egg directory
removed).

Testing:
 - Ran a GVO job

Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
---
M bin/impala-config.sh
M bin/rat_exclude_files.txt
M infra/python/deps/requirements.txt
M shell/.gitignore
D shell/ext-py/thrift-0.14.2/CMakeLists.txt
D shell/ext-py/thrift-0.14.2/Makefile.am
D shell/ext-py/thrift-0.14.2/coding_standards.md
D shell/ext-py/thrift-0.14.2/compat/win32/stdint.h
D shell/ext-py/thrift-0.14.2/setup.cfg
D shell/ext-py/thrift-0.14.2/test/_import_local_thrift.py
D shell/ext-py/thrift-0.14.2/test/test_thrift_file/TestServer.thrift
D shell/ext-py/thrift-0.14.2/test/thrift_TBinaryProtocol.py
D shell/ext-py/thrift-0.14.2/test/thrift_TCompactProtocol.py
D shell/ext-py/thrift-0.14.2/test/thrift_TNonblockingServer.py
D shell/ext-py/thrift-0.14.2/test/thrift_TZlibTransport.py
D shell/ext-py/thrift-0.14.2/test/thrift_json.py
D shell/ext-py/thrift-0.14.2/test/thrift_transport.py
R shell/ext-py/thrift-0.16.0/MANIFEST.in
R shell/ext-py/thrift-0.16.0/README.md
A shell/ext-py/thrift-0.16.0/setup.cfg
R shell/ext-py/thrift-0.16.0/setup.py
R shell/ext-py/thrift-0.16.0/src/TMultiplexedProcessor.py
R shell/ext-py/thrift-0.16.0/src/TRecursive.py
R shell/ext-py/thrift-0.16.0/src/TSCons.py
R shell/ext-py/thrift-0.16.0/src/TSerialization.py
R shell/ext-py/thrift-0.16.0/src/TTornado.py
R shell/ext-py/thrift-0.16.0/src/Thrift.py
R shell/ext-py/thrift-0.16.0/src/__init__.py
R shell/ext-py/thrift-0.16.0/src/compat.py
R shell/ext-py/thrift-0.16.0/src/ext/binary.cpp
R shell/ext-py/thrift-0.16.0/src/ext/binary.h
R shell/ext-py/thrift-0.16.0/src/ext/compact.cpp
R shell/ext-py/thrift-0.16.0/src/ext/compact.h
R shell/ext-py/thrift-0.16.0/src/ext/endian.h
R shell/ext-py/thrift-0.16.0/src/ext/module.cpp
R shell/ext-py/thrift-0.16.0/src/ext/protocol.h
R shell/ext-py/thrift-0.16.0/src/ext/protocol.tcc
R shell/ext-py/thrift-0.16.0/src/ext/types.cpp
R shell/ext-py/thrift-0.16.0/src/ext/types.h
R shell/ext-py/thrift-0.16.0/src/protocol/TBase.py
R shell/ext-py/thrift-0.16.0/src/protocol/TBinaryProtocol.py
R shell/ext-py/thrift-0.16.0/src/protocol/TCompactProtocol.py
R shell/ext-py/thrift-0.16.0/src/protocol/THeaderProtocol.py
R shell/ext-py/thrift-0.16.0/src/protocol/TJSONProtocol.py
R shell/ext-py/thrift-0.16.0/src/protocol/TMultiplexedProtocol.py
R shell/ext-py/thrift-0.16.0/src/protocol/TProtocol.py
R shell/ext-py/thrift-0.16.0/src/protocol/TProtocolDecorator.py
R shell/ext-py/thrift-0.16.0/src/protocol/__init__.py
R shell/ext-py/thrift-0.16.0/src/server/THttpServer.py
R shell/ext-py/thrift-0.16.0/src/server/TNonblockingServer.py
R shell/ext-py/thrift-0.16.0/src/server/TProcessPoolServer.py
R shell/ext-py/thrift-0.16.0/src/server/TServer.py
R shell/ext-py/thrift-0.16.0/src/server/__init__.py
R shell/ext-py/thrift-0.16.0/src/transport/THeaderTransport.py
R shell/ext-py/thrift-0.16.0/src/transport/THttpClient.py
R shell/ext-py/thrift-0.16.0/src/transport/TSSLSocket.py
R shell/ext-py/thrift-0.16.0/src/transport/TSocket.py
R shell/ext-py/thrift-0.16.0/src/transport/TTransport.py
R shell/ext-py/thrift-0.16.0/src/transport/TTwisted.py
R shell/ext-py/thrift-0.16.0/src/transport/TZlibTransport.py
R shell/ext-py/thrift-0.16.0/src/transport/__init__.py
R shell/ext-py/thrift-0.16.0/src/transport/sslcompat.py
R shell/ext-py/thrift-0.16.0/test/test_socket.py
R shell/ext-py/thrift-0.16.0/test/test_sslsocket.py
M shell/packaging/requirements.txt
65 files changed, 48 insertions(+), 1,375 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/56/18456/7
--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-11624: Bump Impyla dependency to 0.18.0

2023-02-24 Thread Joe McDonnell (Code Review)
Joe McDonnell has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18456 )

Change subject: IMPALA-11624: Bump Impyla dependency to 0.18.0
..


Patch Set 6:

I'm going to do an upload to this that adds the ext-py changes.


--
To view, visit http://gerrit.cloudera.org:8080/18456
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7265558b0e07959c606cba73cd251c3edfcb3ed5
Gerrit-Change-Number: 18456
Gerrit-PatchSet: 6
Gerrit-Owner: Csaba Ringhofer 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 19:05:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 5: Code-Review+2

Thanks Riza!


--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 17:16:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..

IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

IMPALA-11658 implements Iceberg manifest caching for Impala. This patch
adds documentation for configuring the cache(s).

Testing:
- Built docs locally

Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Reviewed-on: http://gerrit.cloudera.org:8080/19530
Reviewed-by: Daniel Becker 
Tested-by: Impala Public Jenkins 
Reviewed-by: Zoltan Borok-Nagy 
---
M docs/topics/impala_iceberg.xml
1 file changed, 60 insertions(+), 0 deletions(-)

Approvals:
  Daniel Becker: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified
  Zoltan Borok-Nagy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[native-toolchain-CR] IMPALA-11944: Add SLES15 support

2023-02-24 Thread Michael Smith (Code Review)
Hello Laszlo Gaal,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19528

to look at the new patch set (#2).

Change subject: IMPALA-11944: Add SLES15 support
..

IMPALA-11944: Add SLES15 support

Adds support for building with SLES15 SP4.

Adds a container definition for SLES15 based on public repositories.
SLES 15 SP4 does not support Python 2 so we omit that from the build
and symlink python to python3 for build dependencies. It also doesn't
provide Java 8 packages, so we use Java 11 and explicitly set JAVA_HOME
as Kudu's normal heuristics for finding JAVA_HOME didn't work.

Fixes several invocations of python that unexpectedly used the system
python instead of the python version built in the native-toolchain.

Change-Id: Ie27521cdbec2f7e966e70c2b0a954d391bf1500d
---
M Makefile
M docker/all/assert-dependencies-present.py
M docker/all/postinstall.sh
A docker/sles15.df
M in-docker.py
M source/kudu/build.sh
M source/python/build.sh
M source/thrift/build.sh
8 files changed, 76 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/native-toolchain 
refs/changes/28/19528/2
--
To view, visit http://gerrit.cloudera.org:8080/19528
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie27521cdbec2f7e966e70c2b0a954d391bf1500d
Gerrit-Change-Number: 19528
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Michael Smith 


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 5: Code-Review+1

Thanks Riza.


--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 17:10:12 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 5: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/704/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 17:12:51 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 5:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/19530/4/docs/topics/impala_iceberg.xml
File docs/topics/impala_iceberg.xml:

http://gerrit.cloudera.org:8080/#/c/19530/4/docs/topics/impala_iceberg.xml@617
PS4, Line 617: gd. This f
> Maybe we should mention that it is useful for the Coordinators and CatalogD
Done


http://gerrit.cloudera.org:8080/#/c/19530/4/docs/topics/impala_iceberg.xml@623
PS4, Line 623: 36000
> nit: people might just blindly copying the values from here. So we probably
Make sense. Changed this to 360.



--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 17:05:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Riza Suminto (Code Review)
Hello Andrew Sherman, Daniel Becker, Gabor Kaszab, Zoltan Borok-Nagy, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19530

to look at the new patch set (#5).

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..

IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

IMPALA-11658 implements Iceberg manifest caching for Impala. This patch
adds documentation for configuring the cache(s).

Testing:
- Built docs locally

Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
---
M docs/topics/impala_iceberg.xml
1 file changed, 60 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19530/5
--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11701 Part1: Don't push down predicates if already applied by Iceberg

2023-02-24 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19534 )

Change subject: IMPALA-11701 Part1: Don't push down predicates if already 
applied by Iceberg
..


Patch Set 1:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@16
PS1, Line 16: filter
Nit: filters.


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@17
PS1, Line 17: taking into account
Wouldn't "assuming" be better?


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@17
PS1, Line 17: )
Nit: closing parenthesis should come at the end of the sentence.


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@20
PS1, Line 20: An additional benefit of not down no-op predicates
Am I right that here we mean pushing down predicates to the scanner, as opposed 
to pushing it down to Iceberg? We could make this clearer, perhaps also in the 
title.


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@20
PS1, Line 20: not down
not pushing down


http://gerrit.cloudera.org:8080/#/c/19534/1//COMMIT_MSG@28
PS1, Line 28:
We could include a Testing section describing what tests have been 
added/modified.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
File fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java@2014
PS1, Line 2014: getDerivedExplainString
You could add a comment describing how this method should be used.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java:

http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@64
PS1, Line 64:   private List skippedConjuncts;
What is the relationship between 'conjuncts_', 'nonIdentityConjuncts_' and 
'skippedConjuncts_'? Is 'conjuncts_' the union of the other two? In this case 
'skippedConjuncts_' may not need to be stored (or taken in the constructor) 
separately.

Or, if the 'skippedConjuncts_' have already been removed from 'conjuncts_', the 
comment on L56 has become a bit misleading.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java@64
PS1, Line 64: skippedConjuncts
Should end with an underscore: 'skippedConjuncts_'.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
File fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java:

http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@114
PS1, Line 114: canSkipPushingDownIcebergPredicates
Should end with underscore.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@114
PS1, Line 114: canSkipPushingDownIcebergPredicates
Could make it clearer that here we mean pushing predicates down to the 
scanners, not to Iceberg.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@375
PS1, Line 375:   conjuncts_.removeAll(impalaIcebergPredicates_);
Wouldn't it be possible that some (but not all) conjuncts can be skipped? Can 
this approach be further refined (in a follow-up commit)?


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java@379
PS1, Line 379:   private List getSkippedConjuncts() {
Nit: if we forgot to call filterConjuncts() but called filterFileDescriptors(), 
it's possible that 'conjuncts_' would still contain the elements that 
getSkippedConjuncts() returns.
If we had a separate member for 'skippedConjuncts_' that would be returned by 
getSkippedConjuncts() and would be populated in filterConjuncts(), this anomaly 
would not be possible.


http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java
File fe/src/test/java/org/apache/impala/planner/PlannerTest.java:

http://gerrit.cloudera.org:8080/#/c/19534/1/fe/src/test/java/org/apache/impala/planner/PlannerTest.java@1279
PS1, Line 1279: exercising
Nit: excercising.


http://gerrit.cloudera.org:8080/#/c/19534/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test:

http://gerrit.cloudera.org:8080/#/c/19534/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test@1
PS1, Line 1: in
Nit: no need for 'in'.


http://gerrit.cloudera.org:8080/#/c/19534/1/testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test@2
PS1, Line 2: are
Nit: out.



[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 5:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/704/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 17:04:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11854: ImpalaStringWritable's underlying array can't be changed in UDFs

2023-02-24 Thread Csaba Ringhofer (Code Review)
Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19507 )

Change subject: IMPALA-11854: ImpalaStringWritable's underlying array can't be 
changed in UDFs
..


Patch Set 8:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java
File fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java:

http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java@26
PS8, Line 26:  * values that map to StringValue in the BE.
Can you mention the caching behavior?


http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java@59
PS8, Line 59: ;
optional:
bufferCapacity_ could be set here with getLengthFromNativeHeap()


http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java@84
PS8, Line 84:   public int getCapacity() {
: return bufferCapacity_;
:   }
What happens if this is called before initialize()?


http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java@94
PS8, Line 94: Arrays.copyOf(array_, newCap);
Can't this be called while array_ is still EMPTY_ARRAY?


http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java@104
PS8, Line 104:   initalize();
optional: returning getLengthFromNativeHeap() would allow optimizing the case 
when the length is accessed, but the data is not


http://gerrit.cloudera.org:8080/#/c/19507/8/fe/src/main/java/org/apache/impala/hive/executor/ImpalaStringWritable.java@116
PS8, Line 116: if (EMPTY_ARRAY == array_) {
 :   initalize();
 : }
This could be skipped in this case, as the original array won't be used.


http://gerrit.cloudera.org:8080/#/c/19507/8/testdata/workloads/functional-query/queries/QueryTest/generic-java-udf.test
File 
testdata/workloads/functional-query/queries/QueryTest/generic-java-udf.test:

http://gerrit.cloudera.org:8080/#/c/19507/8/testdata/workloads/functional-query/queries/QueryTest/generic-java-udf.test@358
PS8, Line 358: select increment( cast("a" as binary));
nit: extra space


http://gerrit.cloudera.org:8080/#/c/19507/8/testdata/workloads/functional-query/queries/QueryTest/java-udf.test
File testdata/workloads/functional-query/queries/QueryTest/java-udf.test:

http://gerrit.cloudera.org:8080/#/c/19507/8/testdata/workloads/functional-query/queries/QueryTest/java-udf.test@371
PS8, Line 371: select increment("a");
Can you add a test with empty string? The UDF seems to handle this case and it 
would be nice to test if the executor works in this case. (same for the generic 
case)


http://gerrit.cloudera.org:8080/#/c/19507/8/testdata/workloads/functional-query/queries/QueryTest/java-udf.test@378
PS8, Line 378: select increment( cast("a" as binary));
nit: extra space



--
To view, visit http://gerrit.cloudera.org:8080/19507
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb28bd0dce7b0482c7abe1f61f245691fcbfe212
Gerrit-Change-Number: 19507
Gerrit-PatchSet: 8
Gerrit-Owner: Peter Rozsa 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Peter Rozsa 
Gerrit-Comment-Date: Fri, 24 Feb 2023 16:57:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Zoltan Borok-Nagy (Code Review)
Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 4:

(2 comments)

Thanks for adding docs for it!

http://gerrit.cloudera.org:8080/#/c/19530/4/docs/topics/impala_iceberg.xml
File docs/topics/impala_iceberg.xml:

http://gerrit.cloudera.org:8080/#/c/19530/4/docs/topics/impala_iceberg.xml@617
PS4, Line 617: for Impala
Maybe we should mention that it is useful for the Coordinators and CatalogD as 
well.


http://gerrit.cloudera.org:8080/#/c/19530/4/docs/topics/impala_iceberg.xml@623
PS4, Line 623: 6
nit: people might just blindly copying the values from here. So we probably 
want to provide a higher value as an example than one minute, e.g. one hour?



--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 16:38:32 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Riza Suminto (Code Review)
Riza Suminto has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/19530/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19530/2//COMMIT_MSG@9
PS2, Line 9: IMPALA-11658 implements Iceberg manifest caching for Impala. This 
patch
> Nit: "implements Iceberg manifest caching for Impala."
Done


http://gerrit.cloudera.org:8080/#/c/19530/2//COMMIT_MSG@10
PS2, Line 10: adds documentation for configuring the cache(s).
> Nit: "adds documentation for configuring the cache(s).
Done


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml
File docs/topics/impala_iceberg.xml:

http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@614
PS2, Line 614:
> Nit: contents?
Done


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@617
PS2, Line 617: feature can be enabled for Impala by setting properties 
in Hadoop's core-site.xml
> Nit: "as in the following"
Done


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@631
PS2, Line 631: iceberg.io-impl: custom FileIO 
implementation to use in a
> Do we recommend that this not be changed? Maybe we should say that?
Done


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@632
PS2, Line 632: catalog. Must be set to enable manifest caching. 
Impala defaults to
> Nit: "defaults to"
Done


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@639
PS2, Line 639:
> In the two cases above we started a new line after the title (in ),
Removed newline after  in the other two.


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@640
PS2, Line 640: 
iceberg.io.manifest.cache.max-total-bytes: maximum total
> Why would I want to set this to anything other than a high value? Oh. I see
Yes. Added "and lower than 
iceberg.io.manifest.cache.max-total-bytes.".


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@652
PS2, Line 652: of a manifest file to be considered for caching in 
bytes. Manifest files with
> Maybe move this before iceberg.io.manifest.cache.expiration-interval-ms
Yes, reordered the example and bullet points.


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@659
PS2, Line 659:
> Shouldn't it be HadoopCatalogs and HiveCatalogs, i.e. plural?
Done



--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 16:22:53 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 3: Verified+1

Build Successful

https://jenkins.impala.io/job/gerrit-docs-auto-test/703/ : Doc tests passed.


--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 16:25:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Riza Suminto (Code Review)
Hello Andrew Sherman, Daniel Becker, Gabor Kaszab, Zoltan Borok-Nagy, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19530

to look at the new patch set (#4).

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..

IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

IMPALA-11658 implements Iceberg manifest caching for Impala. This patch
adds documentation for configuring the cache(s).

Testing:
- Built docs locally

Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
---
M docs/topics/impala_iceberg.xml
1 file changed, 60 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19530/4
--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 4
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 3:

Build Started https://jenkins.impala.io/job/gerrit-docs-auto-test/703/

Testing docs change - this change appears to modify docs/ and no code. This is 
experimental - please report any issues to tarmstr...@cloudera.com or on this 
JIRA: IMPALA-7317


--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 16:17:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Riza Suminto (Code Review)
Hello Andrew Sherman, Daniel Becker, Gabor Kaszab, Zoltan Borok-Nagy, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19530

to look at the new patch set (#3).

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..

IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

IMPALA-11658 implement Iceberg manifest caching config for Impala. This
patch add documentation for that settings.

Testing:
- Built docs locally

Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
---
M docs/topics/impala_iceberg.xml
1 file changed, 60 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/30/19530/3
--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19535 )

Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..


Patch Set 1:

(2 comments)

Thanks Quanlong for fixing this.

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@75
PS1, Line 75: StringVal StringFunctions::Substring(FunctionContext* context,
Shouldn't this also be IR_ALWAYS_INLINE? It calls StringFunctions::Substring() 
on L58. In mask-functions-ir.cc we do inline functions that don't directly call 
GetConstFnAttr() but call a function that calls it.

Applies also to Left() and Right().


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@850
PS1, Line 850: IntVal StringFunctions::Instr(FunctionContext* context, const 
StringVal& str,
Shouldn't these Instr overloads, Locate and LocatePos also be IR_ALWAYS_INLINE? 
See comment on L75.



--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 15:26:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19535 )

Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12436/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 14:27:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19535 )

Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/mask-functions-ir.cc
File be/src/exprs/mask-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/mask-functions-ir.cc@778
PS1, Line 778: IR_ALWAYS_INLINE StringVal MaskFunctions::Mask(FunctionContext* 
ctx, const StringVal& val) {
line too long (92 > 90)


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@256
PS1, Line 256: IR_ALWAYS_INLINE IntVal StringFunctions::Length(FunctionContext* 
context, const StringVal& str) {
line too long (97 > 90)


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@289
PS1, Line 289: IR_ALWAYS_INLINE StringVal 
StringFunctions::Lower(FunctionContext* context, const StringVal& str) {
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@307
PS1, Line 307: IR_ALWAYS_INLINE StringVal 
StringFunctions::Upper(FunctionContext* context, const StringVal& str) {
line too long (99 > 90)


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@329
PS1, Line 329: IR_ALWAYS_INLINE StringVal 
StringFunctions::InitCap(FunctionContext* context, const StringVal& str) {
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@621
PS1, Line 621: IR_ALWAYS_INLINE StringVal 
StringFunctions::Reverse(FunctionContext* context, const StringVal& str) {
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/19535/1/be/src/exprs/string-functions-ir.cc@790
PS1, Line 790: IR_ALWAYS_INLINE IntVal StringFunctions::Instr(FunctionContext* 
context, const StringVal& str,
line too long (94 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 14:08:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11943: Mark utf8 string functions with IR ALWAYS INLINE

2023-02-24 Thread Quanlong Huang (Code Review)
Quanlong Huang has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19535


Change subject: IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE
..

IMPALA-11943: Mark utf8 string functions with IR_ALWAYS_INLINE

String functions that have both UTF-8 and the traditional ASCII
behaviors have checks for the UTF8_MODE query option. The check is
intended to be replaced with constants during codegen in
LlvmCodeGen::InlineConstFnAttrs().

However, as mentioned in the method comment, InlineConstFnAttrs() only
replaces call instructions inside the current function. To replace the
call to FunctionContextImpl::GetConstFnAttr() inside the callee
functions, we have to inline the callee functions (by annotating them
with IR_ALWAYS_INLINE).

This patch annotates UTF-8 related string functions with
IR_ALWAYS_INLINE to make sure the checks on UTF8_MODE are all replaced
in codegen.

Perf tests:
Ran PERF_STRING queries in targeted-perf (scale factor 100) on parquet
format. Saw significant improvements:
+-+--++-++
| Query   | File Format  | Avg(s) | Base Avg(s) | Delta(Avg) |
+-+--++-++
| PERF_STRING-Q7  | parquet/none | 10.80  | 10.93   |   -1.13%   |
| PERF_STRING-Q6  | parquet/none | 12.25  | 12.55   |   -2.37%   |
| PERF_STRING-Q5  | parquet/none | 5.87   | 6.01|   -2.33%   |
| PERF_STRING-Q3  | parquet/none | 5.02   | 5.26|   -4.47%   |
| PERF_STRING-Q2  | parquet/none | 4.49   | 4.73|   -4.88%   |
| PERF_STRING-Q4  | parquet/none | 4.99   | 5.28| I -5.48%   |
| PERF_STRING-Q1  | parquet/none | 4.02   | 4.29| I -6.34%   |
+-+--++-++
| PERF_STRING-Q13 | parquet/none | 9.90   | 11.48   | I -13.72%  |
| PERF_STRING-Q9  | parquet/none | 10.01  | 11.64   | I -13.97%  |
| PERF_STRING-Q11 | parquet/none | 9.83   | 11.56   | I -14.97%  |
| PERF_STRING-Q10 | parquet/none | 6.62   | 8.07| I -17.89%  |
| PERF_STRING-Q12 | parquet/none | 6.63   | 8.25| I -19.61%  |
| PERF_STRING-Q8  | parquet/none | 5.03   | 6.45| I -22.09%  |
+-+--++-++
Note that Q8-Q13 are new queries added by this patch to reveal the
performance difference.

Ran the same queries comparing the branch of reverting the UTF-8 changes
(IMPALA-2019). Found no regressions anymore. Even see improvement in one
query (Q13):
+-+--++-++
| Query   | File Format  | Avg(s) | Base Avg(s) | Delta(Avg) |
+-+--++-++
| PERF_STRING-Q11 | parquet/none | 9.85   | 9.56|   +3.04%   |
| PERF_STRING-Q7  | parquet/none | 10.76  | 10.59   |   +1.57%   |
| PERF_STRING-Q1  | parquet/none | 4.04   | 4.00|   +0.83%   |
| PERF_STRING-Q4  | parquet/none | 5.02   | 5.00|   +0.51%   |
| PERF_STRING-Q2  | parquet/none | 4.53   | 4.51|   +0.32%   |
| PERF_STRING-Q5  | parquet/none | 5.81   | 5.81|   +0.02%   |
| PERF_STRING-Q8  | parquet/none | 5.02   | 5.04|   -0.31%   |
| PERF_STRING-Q12 | parquet/none | 6.65   | 6.68|   -0.43%   |
| PERF_STRING-Q3  | parquet/none | 5.02   | 5.05|   -0.53%   |
| PERF_STRING-Q6  | parquet/none | 12.23  | 12.33   |   -0.84%   |
| PERF_STRING-Q10 | parquet/none | 6.68   | 6.74|   -0.87%   |
| PERF_STRING-Q9  | parquet/none | 10.01  | 10.41   |   -3.86%   |
| PERF_STRING-Q13 | parquet/none | 9.92   | 10.54   | I -5.85%   |
+-+--++-++

Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
---
M be/src/exprs/mask-functions-ir.cc
M be/src/exprs/string-functions-ir.cc
M testdata/workloads/targeted-perf/queries/string.test
3 files changed, 182 insertions(+), 108 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/19535/1
--
To view, visit http://gerrit.cloudera.org:8080/19535
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I19e8fba332ae329da8b1d37dba3bbc64f59e6f3a
Gerrit-Change-Number: 19535
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 


[Impala-ASF-CR] IMPALA-11940: [DOCS] Document manifest caching settings for Iceberg

2023-02-24 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19530 )

Change subject: IMPALA-11940: [DOCS] Document manifest caching settings for 
Iceberg
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml
File docs/topics/impala_iceberg.xml:

http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@614
PS2, Line 614: content
Nit: contents?


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@639
PS2, Line 639:
In the two cases above we started a new line after the title (in ), 
from here on we don't. It would be better if it was consistent.


http://gerrit.cloudera.org:8080/#/c/19530/2/docs/topics/impala_iceberg.xml@659
PS2, Line 659: HadoopCatalog and HiveCatalog
Shouldn't it be HadoopCatalogs and HiveCatalogs, i.e. plural?



--
To view, visit http://gerrit.cloudera.org:8080/19530
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idd761a81f5c81a25a5ec0889402f85157c23e9fe
Gerrit-Change-Number: 19530
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Fri, 24 Feb 2023 13:36:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11898: Add query options in the profile if the query failed in planning

2023-02-24 Thread Daniel Becker (Code Review)
Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19517 )

Change subject: IMPALA-11898: Add query options in the profile if the query 
failed in planning
..


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/19517/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19517/1//COMMIT_MSG@13
PS1, Line 13: Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
> Sorry, I can't. I tried to add the relevant tests, but couldn't find where
One place where this could go is tests/query_test/test_observability.py. You 
can take 'test_merge_exchange_num_rows()' as an example of how to run a query 
and retrieve its runtime profile.

I'd suggest to use another test query than the one in the Jira because structs 
containing collections are expected to become supported in the near future. We 
could for example try to access a non-existent table to provoke a planning 
failure.


http://gerrit.cloudera.org:8080/#/c/19517/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19517/2//COMMIT_MSG@7
PS2, Line 7: profile if
"... profile even if ..." would be clearer in my opinion.


http://gerrit.cloudera.org:8080/#/c/19517/2//COMMIT_MSG@9
PS2, Line 9: should usually be
"are normally included in the profile" would be better.


http://gerrit.cloudera.org:8080/#/c/19517/2//COMMIT_MSG@10
PS2, Line 10: Query
No need to capitalise "query"; see also the other "Query" on this line.


http://gerrit.cloudera.org:8080/#/c/19517/2//COMMIT_MSG@11
PS2, Line 11: Upon failure, should add query options to the profile.
It could be something like:

"After this change, query options are also added to the profile upon planning 
failure."



--
To view, visit http://gerrit.cloudera.org:8080/19517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
Gerrit-Change-Number: 19517
Gerrit-PatchSet: 2
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 24 Feb 2023 13:23:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19430 )

Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table
..


Patch Set 14: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/19430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316
Gerrit-Change-Number: 19430
Gerrit-PatchSet: 14
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 24 Feb 2023 13:09:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11701 Part1: Don't push down predicates if already applied by Iceberg

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19534 )

Change subject: IMPALA-11701 Part1: Don't push down predicates if already 
applied by Iceberg
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12435/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19534
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfa80ce469cecfcfbcd0dcb595a6b04b7027285b
Gerrit-Change-Number: 19534
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 11:15:40 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11701 Part1: Don't push down predicates if already applied by Iceberg

2023-02-24 Thread Gabor Kaszab (Code Review)
Gabor Kaszab has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19534


Change subject: IMPALA-11701 Part1: Don't push down predicates if already 
applied by Iceberg
..

IMPALA-11701 Part1: Don't push down predicates if already applied by Iceberg

We push down predicates to Iceberg that uses them to filter out files
when getting the results of planFiles(). Using the
FileScanTask.residual() function we can find out if we have to use
the predicates to further filter the rows of the given files or if
Iceberg has already performed all the filtering.
Basically if we only
filter on partition columns then Iceberg can filter the files based
on this and using these filter in Impala wouldn't filter any more
rows from the output (taking into account that no partition evolution)
was performed on the table.

An additional benefit of not down no-op predicates is that we can
potentially materialize less slots. For example:

SELECT count(1) from iceberg_tbl where part_col = 10;

In the above query Iceberg filters the files using the predicate on
a partition column and then there won't be any need to materialize
'part_col' in Impala, nor to push down the 'part_col = 10' predicate.

Change-Id: Icfa80ce469cecfcfbcd0dcb595a6b04b7027285b
---
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/datasets/functional/functional_schema_template.sql
M testdata/datasets/functional/schema_constraints.csv
A 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-predicates.test
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-plain-count-star-optimization.test
12 files changed, 290 insertions(+), 65 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/34/19534/1
--
To view, visit http://gerrit.cloudera.org:8080/19534
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Icfa80ce469cecfcfbcd0dcb595a6b04b7027285b
Gerrit-Change-Number: 19534
Gerrit-PatchSet: 1
Gerrit-Owner: Gabor Kaszab 


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19533 )

Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12434/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 09:13:59 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19533 )

Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..


Patch Set 1:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12433/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 09:13:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11898: Add query options in the profile if the query failed in planning

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19517 )

Change subject: IMPALA-11898: Add query options in the profile if the query 
failed in planning
..


Patch Set 2:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12432/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
Gerrit-Change-Number: 19517
Gerrit-PatchSet: 2
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 24 Feb 2023 09:05:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/19533 )

Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..

IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

admission controller was capping memory estimates for the backends to
its physical memory. The memory estimates should instead be capped to
the backend's memory limit for admission, which is computed during
daemon initialization in ExecEnv::Init().

Also fixed the issue related to excessive logging in query profiles
when using global admission controller. If the query was queued
the remote admission controller client was logging 'Queued' status in
profile every time it checked the query status and it hadn't changed.

Testing:
- Updated existing unit tests in admission-controller-test.cc
- Also Running exhaustive tests

Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
---
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/remote-admission-control-client.cc
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
4 files changed, 37 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19533/2
--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19532 )

Change subject: IMPALA-11904: Data cache support dumping for reloading
..


Patch Set 4:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12431/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:53:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19533 )

Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/19533/1/be/src/scheduling/admission-controller-test.cc
File be/src/scheduling/admission-controller-test.cc:

http://gerrit.cloudera.org:8080/#/c/19533/1/be/src/scheduling/admission-controller-test.cc@912
PS1, Line 912:   int64_t coordinator_admit_mem_limit = 
schedule_state->GetPerBackendMemLimitForAdmission(true);
line too long (96 > 90)


http://gerrit.cloudera.org:8080/#/c/19533/1/be/src/scheduling/admission-controller-test.cc@913
PS1, Line 913:   int64_t executor_admit_mem_limit = 
schedule_state->GetPerBackendMemLimitForAdmission(false);
line too long (94 > 90)


http://gerrit.cloudera.org:8080/#/c/19533/1/be/src/scheduling/schedule-state.cc
File be/src/scheduling/schedule-state.cc:

http://gerrit.cloudera.org:8080/#/c/19533/1/be/src/scheduling/schedule-state.cc@329
PS1, Line 329:   per_backend_mem_to_admit = min(per_backend_mem_to_admit, 
GetPerBackendMemLimitForAdmission(false));
line too long (101 > 90)


http://gerrit.cloudera.org:8080/#/c/19533/1/be/src/scheduling/schedule-state.cc@330
PS1, Line 330:   coord_backend_mem_to_admit = min(coord_backend_mem_to_admit, 
GetPerBackendMemLimitForAdmission(true));
line too long (104 > 90)



--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:50:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

2023-02-24 Thread Abhishek Rawat (Code Review)
Abhishek Rawat has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19533


Change subject: IMPALA-11858: Cap per backend memory estimate to its memory 
limit for admission
..

IMPALA-11858: Cap per backend memory estimate to its memory limit for admission

admission controller was capping memory estimates for the backends to
its physical memory. The memory estimates should instead be capped to
the backend's memory limit for admission, which is computed during
daemon initialization in ExecEnv::Init().

Also fixed the issue related to excessive logging in query profiles
when using global admission controller. If the query was queued
the remote admission controller client was logging 'Queued' status in
profile every time it checked the query status and it hadn't changed.

Testing:
- Updated existing unit tests in admission-controller-test.cc
- Also Running exhaustive tests

Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
---
M be/src/scheduling/admission-controller-test.cc
M be/src/scheduling/remote-admission-control-client.cc
M be/src/scheduling/schedule-state.cc
M be/src/scheduling/schedule-state.h
4 files changed, 32 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19533/1
--
To view, visit http://gerrit.cloudera.org:8080/19533
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3b1f6e530785ef832dbc831d7cc6793133f3335c
Gerrit-Change-Number: 19533
Gerrit-PatchSet: 1
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-11898: Add query options in the profile if the query failed in planning

2023-02-24 Thread Baike Xia (Code Review)
Baike Xia has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19517 )

Change subject: IMPALA-11898: Add query options in the profile if the query 
failed in planning
..


Patch Set 2:

(3 comments)

Hi Daniel,
Thank you for your reply and suggestions.

http://gerrit.cloudera.org:8080/#/c/19517/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19517/1//COMMIT_MSG@7
PS1, Line 7: i
> Nit: unnecessary double space.
Done


http://gerrit.cloudera.org:8080/#/c/19517/1//COMMIT_MSG@9
PS1, Line 9: Query options should usually be in the profile,
> Could you clarify the commit message?
Yeah, just like the jira title, query options should usually be in the profile, 
but when the Query fails during planning, Query options are missing.

The RunFrontendPlanner method is used for planning. Failure here should add 
query options. e.g.:
Query Options (set by configuration): TIMEZONE=PRC,CLIENT_IDENTIFIER=impala 
shell build version not available.

If the query fails after planning we should list the query options set by 
configuration and the planner. e.g.:
Query Options (set by configuration): TIMEZONE=PRC,CLIENT_IDENTIFIER=impala 
shell build version not available
Query Options (set by configuration and planner): 
MT_DOP=0,TIMEZONE=PRC,CLIENT_IDENTIFIER=impala shell build version not 
available,MINMAX_FILTER_THRESHOLD=0.5,MINMAX_FILTERING_LEVEL=PAGE


http://gerrit.cloudera.org:8080/#/c/19517/1//COMMIT_MSG@13
PS1, Line 13: Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
> Can we add tests that verify that the query options are included in the pro
Sorry, I can't. I tried to add the relevant tests, but couldn't find where to 
add them. Can you tell me?



--
To view, visit http://gerrit.cloudera.org:8080/19517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
Gerrit-Change-Number: 19517
Gerrit-PatchSet: 2
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:44:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19532 )

Change subject: IMPALA-11904: Data cache support dumping for reloading
..


Patch Set 3:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/12430/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:41:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11898: Add query options in the profile if the query failed in planning

2023-02-24 Thread Baike Xia (Code Review)
Baike Xia has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/19517 )

Change subject: IMPALA-11898: Add query options in the profile if the query 
failed in planning
..

IMPALA-11898: Add query options in the profile if the query failed in planning

Query options should usually be in the profile,
but when the Query fails during planning, Query options are missing.
Upon failure, should add query options to the profile.

Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
---
M be/src/service/client-request-state.cc
M be/src/service/impala-server.cc
2 files changed, 5 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/17/19517/2
--
To view, visit http://gerrit.cloudera.org:8080/19517
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0e9ce62008dd5b1671b09eda5365cbb0940ebe64
Gerrit-Change-Number: 19517
Gerrit-PatchSet: 2
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19532 )

Change subject: IMPALA-11904: Data cache support dumping for reloading
..


Patch Set 2:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/12429/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:37:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19532

to look at the new patch set (#4).

Change subject: IMPALA-11904: Data cache support dumping for reloading
..

IMPALA-11904: Data cache support dumping for reloading

Data cache mainly includes cache metadata and cache files. The cache
files are located on the disk and is responsible for storing cached data
content, while the cache metadata is located in the memory and is
responsible for indexing to the cache file according to the cache key.
Before this patch, if the impalad process exits, the cache metadata will
be lost. After the Impalad process restarts, we cannot reuse the cache
file even though it is still on the disk, because there is no
corresponding cache metadata for index.

This patch implements the dump and load functions of the data cache.
After enabling the dump function with setting
'data_cache_enable_dumping', when the Impalad process is closed by
graceful shutdown (kill -SIGRTMIN $pid), the data cache will collect the
cache metadata and dump them to the location where the cache directory
is located. After enabling the load function with setting
'data_cache_enable_loading', when the Impalad process starts, it will
try to load the dumped files on the disk to restore the original cache
metadata, so that the existing cache files can be reused without
refilling the cache.

The cache can be safely dumped during query execution, because before
the dump starts, the data cache will be set to read-only to prevent the
inconsistency between the metadata dump and the cache file. Note that
the dump files will also use disk space. After testing, the size of the
dump file is generally not more than 0.5% of the size of all cache
files.

Testing:
- Add DataCacheTest,#SetReadOnly
Used to test whether set/revoke read-only takes effect, even when there
are writes in progress.
- Add DataCacheTest,#DumpAndLoad
Used to test whether the original cache contents can be read after a
data cache dump and reload.

Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
---
M CMakeLists.txt
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/scheduling/executor-group.cc
M be/src/service/impala-server.cc
M be/src/util/cache/cache-internal.h
M be/src/util/cache/cache.h
M be/src/util/cache/lirs-cache.cc
M be/src/util/cache/rl-cache.cc
12 files changed, 653 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/19532/4
--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19532 )

Change subject: IMPALA-11904: Data cache support dumping for reloading
..


Patch Set 1:

Build Failed

https://jenkins.impala.io/job/gerrit-code-review-checks/12428/ : Initial code 
review checks failed. See linked job for details on the failure.


--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:27:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19532

to look at the new patch set (#3).

Change subject: IMPALA-11904: Data cache support dumping for reloading
..

IMPALA-11904: Data cache support dumping for reloading

Data cache mainly includes cache metadata and cache files. The cache
files are located on the disk and is responsible for storing cached data
content, while the cache metadata is located in the memory and is
responsible for indexing to the cache file according to the cache key.
Before this patch, if the impalad process exits, the cache metadata will
be lost. After the Impalad process restarts, we cannot reuse the cache
file even though it is still on the disk, because there is no
corresponding cache metadata for index.

This patch implements the dump and load functions of the data cache.
After enabling the dump function with setting
'data_cache_enable_dumping', when the Impalad process is closed by
graceful shutdown (kill -SIGRTMIN $pid), the data cache will collect the
cache metadata and dump them to the location where the cache directory
is located. After enabling the load function with setting
'data_cache_enable_loading', when the Impalad process starts, it will
try to load the dumped files on the disk to restore the original cache
metadata, so that the existing cache files can be reused without
refilling the cache.

The cache can be safely dumped during query execution, because before
the dump starts, the data cache will be set to read-only to prevent the
inconsistency between the metadata dump and the cache file. Note that
the dump files will also use disk space. After testing, the size of the
dump file is generally not more than 0.5% of the size of all cache
files.

Testing:
- Add DataCacheTest,#SetReadOnly
Used to test whether set/revoke read-only takes effect, even when there
are writes in progress.
- Add DataCacheTest,#DumpAndLoad
Used to test whether the original cache contents can be read after a
data cache dump and reload.

Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
---
M CMakeLists.txt
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/scheduling/executor-group.cc
M be/src/service/impala-server.cc
M be/src/util/cache/cache-internal.h
M be/src/util/cache/cache.h
M be/src/util/cache/lirs-cache.cc
M be/src/util/cache/rl-cache.cc
12 files changed, 651 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/19532/3
--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 3
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Anonymous Coward (Code Review)
Hello Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/19532

to look at the new patch set (#2).

Change subject: IMPALA-11904: Data cache support dumping for reloading
..

IMPALA-11904: Data cache support dumping for reloading

Data cache mainly includes cache metadata and cache files. The cache
files are located on the disk and is responsible for storing cached data
content, while the cache metadata is located in the memory and is
responsible for indexing to the cache file according to the cache key.
Before this patch, if the impalad process exits, the cache metadata will
be lost. After the Impalad process restarts, we cannot reuse the cache
file even though it is still on the disk, because there is no
corresponding cache metadata for index.

This patch implements the dump and load functions of the data cache.
After enabling the dump function with setting
'data_cache_enable_dumping', when the Impalad process is closed by
graceful shutdown (kill -SIGRTMIN $pid), the data cache will collect the
cache metadata and dump them to the location where the cache directory
is located. After enabling the load function with setting
'data_cache_enable_loading', when the Impalad process starts, it will
try to load the dumped files on the disk to restore the original cache
metadata, so that the existing cache files can be reused without
refilling the cache.

The cache can be safely dumped during query execution, because before
the dump starts, the data cache will be set to read-only to prevent the
inconsistency between the metadata dump and the cache file. Note that
the dump files will also use disk space. After testing, the size of the
dump file is generally not more than 0.5% of the size of all cache
files.

Testing:
- Add DataCacheTest,#SetReadOnly
Used to test whether set/revoke read-only takes effect, even when there
are writes in progress.
- Add DataCacheTest,#DumpAndLoad
Used to test whether the original cache contents can be read after a
data cache dump and reload.

Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
---
M CMakeLists.txt
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/scheduling/executor-group.cc
M be/src/service/impala-server.cc
M be/src/util/cache/cache-internal.h
M be/src/util/cache/cache.h
M be/src/util/cache/lirs-cache.cc
M be/src/util/cache/rl-cache.cc
12 files changed, 651 insertions(+), 17 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/19532/2
--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19532 )

Change subject: IMPALA-11904: Data cache support dumping for reloading
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19532/2/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19532/2/be/src/runtime/io/data-cache.h@496
PS2, Line 496:
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 2
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:19:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-3120: Support Bucket Shuffle Join for bucketed table

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19430 )

Change subject: IMPALA-3120: Support Bucket Shuffle Join for bucketed table
..


Patch Set 14:

Build Successful

https://jenkins.impala.io/job/gerrit-code-review-checks/12427/ : Initial code 
review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun 
to run full precommit tests.


--
To view, visit http://gerrit.cloudera.org:8080/19430
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If321e7987bc88374d79500cffb77ea25b2ed0316
Gerrit-Change-Number: 19430
Gerrit-PatchSet: 14
Gerrit-Owner: Baike Xia 
Gerrit-Reviewer: Aman Sinha 
Gerrit-Reviewer: Baike Xia 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:15:33 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Anonymous Coward (Code Review)
18770832...@163.com has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/19532


Change subject: IMPALA-11904: Data cache support dumping for reloading
..

IMPALA-11904: Data cache support dumping for reloading

Data cache mainly includes cache metadata and cache files. The cache
files are located on the disk and is responsible for storing cached data
content, while the cache metadata is located in the memory and is
responsible for indexing to the cache file according to the cache key.
Before this patch, if the impalad process exits, the cache metadata will
be lost. After the Impalad process restarts, we cannot reuse the cache
file even though it is still on the disk, because there is no
corresponding cache metadata for index.

This patch implements the dump and load functions of the data cache.
After enabling the dump function with setting
'data_cache_enable_dumping', when the Impalad process is closed by
graceful shutdown (kill -SIGRTMIN $pid), the data cache will collect the
cache metadata and dump them to the location where the cache directory
is located. After enabling the load function with setting
'data_cache_enable_loading', when the Impalad process starts, it will
try to load the dumped files on the disk to restore the original cache
metadata, so that the existing cache files can be reused without
refilling the cache.

The cache can be safely dumped during query execution, because before
the dump starts, the data cache will be set to read-only to prevent the
inconsistency between the metadata dump and the cache file. Note that
the dump files will also use disk space. After testing, the size of the
dump file is generally not more than 0.5% of the size of all cache
files.

Testing:
- Add DataCacheTest,#SetReadOnly
Used to test whether set/revoke read-only takes effect, even when there
are writes in progress.
- Add DataCacheTest,#DumpAndLoad
Used to test whether the original cache contents can be read after a
data cache dump and reload.

Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
---
M CMakeLists.txt
M be/src/runtime/io/data-cache-test.cc
M be/src/runtime/io/data-cache.cc
M be/src/runtime/io/data-cache.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/disk-io-mgr.h
M be/src/scheduling/executor-group.cc
M be/src/service/impala-server.cc
M be/src/util/cache/cache-internal.h
M be/src/util/cache/cache.h
M be/src/util/cache/lirs-cache.cc
M be/src/util/cache/rl-cache.cc
12 files changed, 651 insertions(+), 17 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/32/19532/1
--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18770832...@163.com>


[Impala-ASF-CR] IMPALA-11904: Data cache support dumping for reloading

2023-02-24 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19532 )

Change subject: IMPALA-11904: Data cache support dumping for reloading
..


Patch Set 1:

(12 comments)

http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.h
File be/src/runtime/io/data-cache.h:

http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.h@496
PS1, Line 496:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc
File be/src/runtime/io/data-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc@204
PS1, Line 204: kudu::Env::Default()->NewRWFile(opts, path, 
_file->file_),
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc@484
PS1, Line 484: string key; 
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc@487
PS1, Line 487: /// cache file. When we dump the cache metadata, the ‘file_’ 
is meaningless because
line too long (91 > 90)


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc@683
PS1, Line 683:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc@1037
PS1, Line 1037:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/data-cache.cc@1284
PS1, Line 1284:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/disk-io-mgr.h
File be/src/runtime/io/disk-io-mgr.h:

http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/disk-io-mgr.h@417
PS1, Line 417:   /// Try to dump the data of remote data cache to disk, so it 
could be loaded when
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/disk-io-mgr.cc
File be/src/runtime/io/disk-io-mgr.cc:

http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/runtime/io/disk-io-mgr.cc@78
PS1, Line 78: DEFINE_bool(data_cache_enable_dumping, false,
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/util/cache/lirs-cache.cc@1086
PS1, Line 1086:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/util/cache/lirs-cache.cc@1105
PS1, Line 1105:
line has trailing whitespace


http://gerrit.cloudera.org:8080/#/c/19532/1/be/src/util/cache/lirs-cache.cc@1125
PS1, Line 1125:
line has trailing whitespace



--
To view, visit http://gerrit.cloudera.org:8080/19532
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id867f4fc7343898e4906332c3caa40eb57a03101
Gerrit-Change-Number: 19532
Gerrit-PatchSet: 1
Gerrit-Owner: Anonymous Coward <18770832...@163.com>
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 24 Feb 2023 08:09:24 +
Gerrit-HasComments: Yes