[Impala-ASF-CR] IMPALA-13031: Enhancing logging for spilling configuration with local buffer directory details

2024-05-01 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21350 )

Change subject: IMPALA-13031: Enhancing logging for spilling configuration with 
local buffer directory details
..


Patch Set 1:

Thanks wenzhe for reviewing it.


--
To view, visit http://gerrit.cloudera.org:8080/21350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8fb357016d72a363ee5016f7881b0f6b0426aff5
Gerrit-Change-Number: 21350
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 01 May 2024 23:27:36 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13031: Enhancing logging for spilling configuration with local buffer directory details

2024-04-23 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21350


Change subject: IMPALA-13031: Enhancing logging for spilling configuration with 
local buffer directory details
..

IMPALA-13031: Enhancing logging for spilling configuration with local buffer 
directory details

The patch adds logging for local buffer directory when using
remote scratch space. The printed log would be like
"Using local buffer directory for scratch space
/tmp/test/impala-scratch on disk 8 limit: 500.00 MB,
priority: 2147483647".

Tests:
Manally tests the logging working as described.

Change-Id: I8fb357016d72a363ee5016f7881b0f6b0426aff5
---
M be/src/runtime/tmp-file-mgr-internal.h
M be/src/runtime/tmp-file-mgr.cc
2 files changed, 14 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/50/21350/1
--
To view, visit http://gerrit.cloudera.org:8080/21350
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8fb357016d72a363ee5016f7881b0f6b0426aff5
Gerrit-Change-Number: 21350
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-13015: Dataload fails due to concurrency issue with test.jceks

2024-04-22 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21346 )

Change subject: IMPALA-13015: Dataload fails due to concurrency issue with 
test.jceks
..


Patch Set 3: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7fbeffc19f2b78c19fee9acf7f96466c8f4f9bcd
Gerrit-Change-Number: 21346
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 23 Apr 2024 02:28:02 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-13015: Dataload fails due to concurrency issue with test.jceks

2024-04-22 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21346 )

Change subject: IMPALA-13015: Dataload fails due to concurrency issue with 
test.jceks
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21346/2/testdata/bin/create-load-data.sh
File testdata/bin/create-load-data.sh:

http://gerrit.cloudera.org:8080/#/c/21346/2/testdata/bin/create-load-data.sh@555
PS2, Line 555: run-step "Creating hadoop credential" 
create-hadoop-credential.log \
How about adding a comment here to explain why we place the command here and 
the potential concurrency issue with the Jira number?



--
To view, visit http://gerrit.cloudera.org:8080/21346
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7fbeffc19f2b78c19fee9acf7f96466c8f4f9bcd
Gerrit-Change-Number: 21346
Gerrit-PatchSet: 2
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 23 Apr 2024 00:28:23 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

2024-04-18 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21315 )

Change subject: IMPALA-13004: Fix heap-use-after-free error in ExprTest 
AiFunctionsTest
..


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21315/2/be/src/exprs/ai-functions.inline.h
File be/src/exprs/ai-functions.inline.h:

http://gerrit.cloudera.org:8080/#/c/21315/2/be/src/exprs/ai-functions.inline.h@108
PS2, Line 108: plac
> 'Move' is good from the context of the change, but if someone is reading th
Done


http://gerrit.cloudera.org:8080/#/c/21315/2/be/src/exprs/ai-functions.inline.h@108
PS2, Line 108:
> Nit: it is not a loop, I wrote it wrong in my comment. "'if' statement" wou
Done


http://gerrit.cloudera.org:8080/#/c/21315/2/be/src/exprs/ai-functions.inline.h@178
PS2, Line 178:   std::string response = AiGenerateTextParseOpenAiResponse(
> The other alternative would've been to create rapid::json Document and pass
Agree. Preferring string for now



--
To view, visit http://gerrit.cloudera.org:8080/21315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
Gerrit-Change-Number: 21315
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 18 Apr 2024 13:38:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

2024-04-18 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21315 )

Change subject: IMPALA-13004: Fix heap-use-after-free error in ExprTest 
AiFunctionsTest
..

IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

The issue is that the code previously used a std::string_view to
hold the data which is actually returned by rapidjson::Document.
However, the rapidjson::Document object gets destroyed after
creating the std::string_view. This meant the std::string_view
referenced memory that was no longer valid, leading to a
heap-use-after-free error.

This patch fixes this issue by modifying the function to
return a std::string instead of a std::string_view. When the
function returns a string, it creates a copy of the
data from rapidjson::Document. This ensures the returned
string has its own memory allocation and doesn't rely on
the destroyed rapidjson::Document.

Tests:
Reran the asan build and passed.

Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
---
M be/src/exprs/ai-functions-ir.cc
M be/src/exprs/ai-functions.h
M be/src/exprs/ai-functions.inline.h
M be/src/exprs/expr-test.cc
4 files changed, 11 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/21315/3
--
To view, visit http://gerrit.cloudera.org:8080/21315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
Gerrit-Change-Number: 21315
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12874: Identify active and standby catalog and statestore in the web debug endpoint

2024-04-18 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21294 )

Change subject: IMPALA-12874: Identify active and standby catalog and 
statestore in the web debug endpoint
..


Patch Set 4:

Irrelevant Iceberg issue: IMPALA-12621


--
To view, visit http://gerrit.cloudera.org:8080/21294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
Gerrit-Change-Number: 21294
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 18 Apr 2024 13:21:31 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12874: Identify active and standby catalog and statestore in the web debug endpoint

2024-04-17 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21294 )

Change subject: IMPALA-12874: Identify active and standby catalog and 
statestore in the web debug endpoint
..


Patch Set 3:

(3 comments)

Thanks for the review. This patch fixes previous feedbacks, and combines the 
logic of adding status to a function AddActiveStatus() in 
default-path-handlers.cc.

http://gerrit.cloudera.org:8080/#/c/21294/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21294/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-12874: Identify active and standby catalog and statestore 
in the web debug endpoint
> Incomplete sentence:
Done


http://gerrit.cloudera.org:8080/#/c/21294/2//COMMIT_MSG@20
PS2, Line 20: Manually tests the web page, and verified the status display is
: correct.
> If there is a failover does the status get updated? Can we also test that?
Tested and worked. Added in the commit message.


http://gerrit.cloudera.org:8080/#/c/21294/2/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

http://gerrit.cloudera.org:8080/#/c/21294/2/be/src/util/default-path-handlers.cc@271
PS2, Line 271:
> admissiond also uses DaemonEnv, we should handle that case also. Maybe exit
Done



--
To view, visit http://gerrit.cloudera.org:8080/21294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
Gerrit-Change-Number: 21294
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 17 Apr 2024 18:18:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12874: Identify active and standby catalog and statestore in the web debug endpoint

2024-04-17 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21294 )

Change subject: IMPALA-12874: Identify active and standby catalog and 
statestore in the web debug endpoint
..

IMPALA-12874: Identify active and standby catalog and statestore in the web 
debug endpoint

This patch adds support to display the HA status of catalog and
statestore on the root web page. The status will be presented
as "Catalog Status: Active" or "Statestore Status: Standby"
based on the values retrieved from the metrics
catalogd-server.active-status and statestore.active-status.

If the catalog or statestore is standalone, it will show active as
the status, which is same as the metric.

Tests:
Ran core tests.
Manually tests the web page, and verified the status display is
correct. Also checked the situation when the failover happens,
the current 'standby' status can be changed to 'active'.

Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
---
M be/src/common/daemon-env.h
M be/src/util/default-path-handlers.cc
M be/src/util/default-path-handlers.h
M www/root.tmpl
4 files changed, 69 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/21294/3
--
To view, visit http://gerrit.cloudera.org:8080/21294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
Gerrit-Change-Number: 21294
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

2024-04-17 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21315 )

Change subject: IMPALA-13004: Fix heap-use-after-free error in ExprTest 
AiFunctionsTest
..


Patch Set 2:

(2 comments)

Thanks Daniel for the review.

http://gerrit.cloudera.org:8080/#/c/21315/1/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21315/1/be/src/exprs/ai-functions-ir.cc@88
PS1, Line 88: string
> Nit: we could keep the "std::" prefix, it is also used at L81 and L74.
Thanks for pointing this out. But I would prefer removing unnecessary prefixes 
like line 49.


http://gerrit.cloudera.org:8080/#/c/21315/1/be/src/exprs/ai-functions.inline.h
File be/src/exprs/ai-functions.inline.h:

http://gerrit.cloudera.org:8080/#/c/21315/1/be/src/exprs/ai-functions.inline.h@107
PS1, Line 107:   // Caution: 'payload' might reference data owned by 
'overrides'.
> Is it intentional that 'overrides' is taken out of the loop? Is it because
Yes. Added a comment here.



--
To view, visit http://gerrit.cloudera.org:8080/21315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
Gerrit-Change-Number: 21315
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 17 Apr 2024 16:27:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

2024-04-17 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21315 )

Change subject: IMPALA-13004: Fix heap-use-after-free error in ExprTest 
AiFunctionsTest
..

IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

The issue is that the code previously used a std::string_view to
hold the data which is actually returned by rapidjson::Document.
However, the rapidjson::Document object gets destroyed after
creating the std::string_view. This meant the std::string_view
referenced memory that was no longer valid, leading to a
heap-use-after-free error.

This patch fixes this issue by modifying the function to
return a std::string instead of a std::string_view. When the
function returns a string, it creates a copy of the
data from rapidjson::Document. This ensures the returned
string has its own memory allocation and doesn't rely on
the destroyed rapidjson::Document.

Tests:
Reran the asan build and passed.

Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
---
M be/src/exprs/ai-functions-ir.cc
M be/src/exprs/ai-functions.h
M be/src/exprs/ai-functions.inline.h
M be/src/exprs/expr-test.cc
4 files changed, 11 insertions(+), 9 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/21315/2
--
To view, visit http://gerrit.cloudera.org:8080/21315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
Gerrit-Change-Number: 21315
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

2024-04-16 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21315


Change subject: IMPALA-13004: Fix heap-use-after-free error in ExprTest 
AiFunctionsTest
..

IMPALA-13004: Fix heap-use-after-free error in ExprTest AiFunctionsTest

The issue is that the code previously used a std::string_view to
hold the data which is actually returned by rapidjson::Document.
However, the rapidjson::Document object gets destroyed after
creating the std::string_view. This meant the std::string_view
referenced memory that was no longer valid, leading to a
heap-use-after-free error.

This patch fixes this issue by modifying the function to
return a std::string instead of a std::string_view. When the
function returns a string, it creates a copy of the
data from rapidjson::Document. This ensures the returned
string has its own memory allocation and doesn't rely on
the destroyed rapidjson::Document.

Tests:
Reran the asan build and passed.

Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
---
M be/src/exprs/ai-functions-ir.cc
M be/src/exprs/ai-functions.h
M be/src/exprs/ai-functions.inline.h
M be/src/exprs/expr-test.cc
4 files changed, 7 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/21315/1
--
To view, visit http://gerrit.cloudera.org:8080/21315
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3bb9dcf9d72cce7ad37d5bc25821cf6ee55a8ab5
Gerrit-Change-Number: 21315
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 


[Impala-ASF-CR] WIP: IMPALA-12046: Add profile counter for scan range queueing time on disk queues

2024-04-16 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21309


Change subject: WIP: IMPALA-12046: Add profile counter for scan range queueing 
time on disk queues
..

WIP: IMPALA-12046: Add profile counter for scan range queueing time on disk 
queues

This patch introduces a new timer counter in the query profile
to track the time spent by scan ranges in the disk io queues.
The solution is to add a timer to each request range and
starting and stopping the timer when the request range is
in the disk io queue. Each request context will have a
queueing timer counter, and every timer associated with
the range belonging to the request context will aggregate
the time count to this counter.

Also adds Reset() function to existing timer to enable
fast resetting of the timer. Since ranges are now reusable,
it is more efficient to reset the timer instead of
reallocating it every time the range is reused.

Tests:
Manuall tests and it shows the aggregated queueing time in
the profile.
TODO: Test the performance impact and add tests.

Change-Id: Ia0cb24db36cd89933c150b88e166e68a3abc7a60
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/runtime/io/disk-io-mgr.cc
M be/src/runtime/io/request-context.cc
M be/src/runtime/io/request-context.h
M be/src/runtime/io/request-ranges.h
M be/src/util/runtime-profile-counters.h
M be/src/util/stopwatch.h
8 files changed, 90 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/09/21309/1
--
To view, visit http://gerrit.cloudera.org:8080/21309
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia0cb24db36cd89933c150b88e166e68a3abc7a60
Gerrit-Change-Number: 21309
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-12874: Identify active and standby catalog and statestore the web debug endpoint

2024-04-15 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21294 )

Change subject: IMPALA-12874: Identify active and standby catalog and 
statestore the web debug endpoint
..


Patch Set 2:

(2 comments)

Thanks Wenzhe for reviewing.

http://gerrit.cloudera.org:8080/#/c/21294/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21294/1//COMMIT_MSG@15
PS1, Line 15: the
: status as
> nit: status as active
Done


http://gerrit.cloudera.org:8080/#/c/21294/1/be/src/util/default-path-handlers.cc
File be/src/util/default-path-handlers.cc:

http://gerrit.cloudera.org:8080/#/c/21294/1/be/src/util/default-path-handlers.cc@283
PS1, Line 283: if (metric->GetValue()) {
 :   document->AddMember(
 :   "catalogd_active_status", "Active", 
document->GetAllocator());
 : } else {
 :   document->AddMember(
 :   "catalogd_active_status", "Standby", 
document->GetAllocator());
 : }
> nit: this can be simplified as
Tried, but the initializer of rapidjson::GenericValue seems not working this 
way..



--
To view, visit http://gerrit.cloudera.org:8080/21294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
Gerrit-Change-Number: 21294
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Apr 2024 17:06:16 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12874: Identify active and standby catalog and statestore the web debug endpoint

2024-04-15 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/21294 )

Change subject: IMPALA-12874: Identify active and standby catalog and 
statestore the web debug endpoint
..

IMPALA-12874: Identify active and standby catalog and statestore the web debug 
endpoint

This patch adds support to display the HA status of catalog and
statestore on the root web page. The status will be presented
as "Catalog Status: Active" or "Statestore Status: Standby"
based on the values retrieved from the metrics
catalogd-server.active-status and statestore.active-status.

If the catalog or statestore is standalone, it will show the
status as active, which is same as the metric.

Tests:
Ran core tests.
Manually tests the web page, and verified the status display is
correct.

Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
---
M be/src/common/daemon-env.h
M be/src/util/default-path-handlers.cc
M be/src/util/default-path-handlers.h
M www/root.tmpl
4 files changed, 61 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/21294/2
--
To view, visit http://gerrit.cloudera.org:8080/21294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
Gerrit-Change-Number: 21294
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12874: Identify active and standby catalog and statestore the web debug endpoint

2024-04-12 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21294


Change subject: IMPALA-12874: Identify active and standby catalog and 
statestore the web debug endpoint
..

IMPALA-12874: Identify active and standby catalog and statestore the web debug 
endpoint

This patch adds support to display the HA status of catalog and
statestore on the root web page. The status will be presented
as "Catalog Status: Active" or "Statestore Status: Standby"
based on the values retrieved from the metrics
catalogd-server.active-status and statestore.active-status.

If the catalog or statestore is standalone, it will show active as
the status, which is same as the metric.

Tests:
Ran core tests.
Manually tests the web page, and verified the status display is
correct.

Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
---
M be/src/common/daemon-env.h
M be/src/util/default-path-handlers.cc
M be/src/util/default-path-handlers.h
M www/root.tmpl
4 files changed, 61 insertions(+), 6 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/94/21294/1
--
To view, visit http://gerrit.cloudera.org:8080/21294
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie9435ba7a9549ea56f9d080a9315aecbcc630cd2
Gerrit-Change-Number: 21294
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-12963: Return parent PID when children spawned

2024-04-10 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21278 )

Change subject: IMPALA-12963: Return parent PID when children spawned
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/21278
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I214e79507c717340863d27f68f6ea54c169e4090
Gerrit-Change-Number: 21278
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 10 Apr 2024 16:58:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12920: Support ai generate text built-in function for OpenAI's chat completion API

2024-04-10 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 16: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21168/16/be/src/exprs/ai-functions.h
File be/src/exprs/ai-functions.h:

http://gerrit.cloudera.org:8080/#/c/21168/16/be/src/exprs/ai-functions.h@18
PS16, Line 18: #ifndef IMPALA_EXPRS_AI_FUNCTIONS_H
 : #define IMPALA_EXPRS_AI_FUNCTIONS_H
nit. #pragma once seems simpler


http://gerrit.cloudera.org:8080/#/c/21168/16/be/src/exprs/ai-functions.inline.h
File be/src/exprs/ai-functions.inline.h:

http://gerrit.cloudera.org:8080/#/c/21168/16/be/src/exprs/ai-functions.inline.h@18
PS16, Line 18: #ifndef IMPALA_EXPRS_AI_FUNCTIONS_INLINE_H
 : #define IMPALA_EXPRS_AI_FUNCTIONS_INLINE_H
nit. #pragma once seems simpler, but not a big deal



--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 16
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 10 Apr 2024 15:41:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12905: Disk-based tuple caching

2024-04-08 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21171 )

Change subject: IMPALA-12905: Disk-based tuple caching
..


Patch Set 9:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21171/9/be/src/exec/tuple-file-writer.h
File be/src/exec/tuple-file-writer.h:

http://gerrit.cloudera.org:8080/#/c/21171/9/be/src/exec/tuple-file-writer.h@46
PS9, Line 46: wtihout
nit. without


http://gerrit.cloudera.org:8080/#/c/21171/9/be/src/runtime/tuple-cache-mgr.h
File be/src/runtime/tuple-cache-mgr.h:

http://gerrit.cloudera.org:8080/#/c/21171/9/be/src/runtime/tuple-cache-mgr.h@110
PS9, Line 110: increament
nit. increment



-- 
To view, visit http://gerrit.cloudera.org:8080/21171
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I13a65c4c0559cad3559d5f714a074dd06e9cc9bf
Gerrit-Change-Number: 21171
Gerrit-PatchSet: 9
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 08 Apr 2024 22:28:07 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..


Patch Set 1:

Thanks Riza and Wenzhe for the review.


--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Apr 2024 15:27:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-04 Thread Yida Wu (Code Review)
Yida Wu has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21235 )

Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..

IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

This patch fixes a bug in the RowsPassedThrough metric within the
query profile while using Streaming Aggregation. The issue is from
the AddBatchStreaming() function's logic, where the number of rows
in the output batch isn't necessarily initialized to 0, while the
function uses num_rows() of the output batch directly to be the
actual number of rows returned and passed through of this specific
aggregator. This discrepancy can significantly impact the accuracy
of the returned and passed through numbers, as well as the
calculation of reduction rates during hash table expansion in
Streaming Aggregation. Huge differences can be observed especially
when using the rollup function.

The solution is to calculate the actual number of rows added
to the output batch within each round of the AddBatchStreaming()
function.

Tests:
Passed exhaustive tests.
Added a corresponding case in tpch-passthrough-aggregations.test.

Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Reviewed-on: http://gerrit.cloudera.org:8080/21235
Reviewed-by: Wenzhe Zhou 
Reviewed-by: Riza Suminto 
Tested-by: Impala Public Jenkins 
---
M be/src/exec/grouping-aggregator.cc
M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test
2 files changed, 27 insertions(+), 2 deletions(-)

Approvals:
  Wenzhe Zhou: Looks good to me, but someone else must approve
  Riza Suminto: Looks good to me, approved
  Impala Public Jenkins: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

2024-04-03 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21235


Change subject: IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in 
Streaming Aggregation
..

IMPALA-12960: Fix Incorrect RowsPassedThrough Metric in Streaming Aggregation

This patch fixes a bug in the RowsPassedThrough metric within the
query profile while using Streaming Aggregation. The issue is from
the AddBatchStreaming() function's logic, where the number of rows
in the output batch isn't necessarily initialized to 0, while the
function uses num_rows() of the output batch directly to be the
actual number of rows returned and passed through of this specific
aggregator. This discrepancy can significantly impact the accuracy
of the returned and passed through numbers, as well as the
calculation of reduction rates during hash table expansion in
Streaming Aggregation. Huge differences can be observed especially
when using the rollup function.

The solution is to calculate the actual number of rows added
to the output batch within each round of the AddBatchStreaming()
function.

Tests:
Passed exhaustive tests.
Added a corresponding case in tpch-passthrough-aggregations.test.

Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
---
M be/src/exec/grouping-aggregator.cc
M testdata/workloads/tpch/queries/tpch-passthrough-aggregations.test
2 files changed, 27 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/35/21235/1
--
To view, visit http://gerrit.cloudera.org:8080/21235
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I59205a4b06824ee1607a25e906db1f96dc4eda9f
Gerrit-Change-Number: 21235
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-12920: Support ai generate text built-in function for OpenAI's chat completion API

2024-04-03 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 6: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@83
PS5, Line 83:   != nullptr);
> This was suggested by clang to improve readability, I think. I'm inclined t
Ack


http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@251
PS5, Line 251: response
> Some of that would be controlled by parameters such as max_tokens that you
That makes sense. Ack.



--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 6
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 03 Apr 2024 15:05:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12920: Support ai generate text built-in function for OpenAI's chat completion API

2024-04-01 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 5: Code-Review+1

(4 comments)

LGTM! Some minor comments.

http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@83
PS5, Line 83:   != nullptr);
nit. seems we can save one line here because above line has enough space


http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@225
PS5, Line 225: kudu::EasyCurl curl;
 :   
curl.set_timeout(kudu::MonoDelta::FromSeconds(FLAGS_ai_connection_timeout_s));
 :   curl.set_fail_on_http_error(true);
 :   kudu::faststring resp;
 :   kudu::Status status = curl.PostToURL(endpoint_str, 
payload_str, , headers);
 :   VLOG(2) << "AI Generate Text: \noriginal response: " << 
resp.ToString();
 :   if (!status.ok()) {
 : string msg = status.ToString();
 : return StringVal::CopyFrom(
 : ctx, reinterpret_cast(msg.c_str()), 
msg.size());
 :   }
Can we add a summary comment about what this part of code is doing? I think it 
does the real thing to contact with the Ai endpoint, and can be critical to 
performance, maybe a comment can make it clearer for others to understand its 
importance.


http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@236
PS5, Line 236: JSON string
nit. how about saying "response JSON string"


http://gerrit.cloudera.org:8080/#/c/21168/5/be/src/exprs/ai-functions-ir.cc@251
PS5, Line 251: response
Thinking about whether we need to worry about the case when the length of 
response is very large or unexpected large. How about we print a warning log if 
the response is larger than certain size?



--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 5
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 01 Apr 2024 22:19:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4545: Simplify test dimension in test decimal casting.py

2024-03-22 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21174 )

Change subject: IMPALA-4545: Simplify test dimension in test_decimal_casting.py
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe269e08a955097ad9e924d5d64b42438ad15be2
Gerrit-Change-Number: 21174
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 22 Mar 2024 21:24:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12920: Support ai generate text built-in function for OpenAI's chat completion API

2024-03-21 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21168 )

Change subject: IMPALA-12920: Support ai_generate_text built-in function for 
OpenAI's chat completion API
..


Patch Set 3:

(8 comments)

Thanks for bringing this up! It looks good overall. I noticed some formatting 
issues in the file ai-functions-ir.cc, could you please rerun clang-format on 
that file?

http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc
File be/src/exprs/ai-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@74
PS3, Line 74: https://
Maybe declare a const string for "https://;, we can also use size of it instead 
of strlen every time.


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@75
PS3, Line 75: 
no need spaces


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@79
PS3, Line 79: openai.azure.com
Maybe declare a const string for it? Also same to "api.openai.com"


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@81
PS3, Line 81:
no need spaces


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@85
PS3, Line 85: choices
Thinking may be good the have const strings for all the json field names, 
easier to maintain the field name later


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@122
PS3, Line 122:
should be two spaces


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/ai-functions-ir.cc@196
PS3, Line 196: std::
doesn't need the namespace


http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/21168/3/be/src/exprs/string-functions-ir.cc@1950
PS3, Line 1950: }
Seems the change is unnecessary.



--
To view, visit http://gerrit.cloudera.org:8080/21168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id4446957f6030bab1f985fdd69185c3da07d7c4b
Gerrit-Change-Number: 21168
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 22 Mar 2024 01:14:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4545: Simplify test dimension in test decimal casting.py

2024-03-21 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21174 )

Change subject: IMPALA-4545: Simplify test dimension in test_decimal_casting.py
..


Patch Set 3: Code-Review+1

LGTM! I can upgrade to +2 if no one else comments


--
To view, visit http://gerrit.cloudera.org:8080/21174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe269e08a955097ad9e924d5d64b42438ad15be2
Gerrit-Change-Number: 21174
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 21 Mar 2024 18:14:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-4545: Simplify test dimension in test decimal casting.py

2024-03-21 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21174 )

Change subject: IMPALA-4545: Simplify test dimension in test_decimal_casting.py
..


Patch Set 2: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21174/1/tests/query_test/test_decimal_casting.py
File tests/query_test/test_decimal_casting.py:

http://gerrit.cloudera.org:8080/#/c/21174/1/tests/query_test/test_decimal_casting.py@54
PS1, Line 54: Core/pairwise only deals with precision
> Yes, core is permuted the same as pairwise now.
Can you also mention it in the commit message? Because the testing behavior for 
"pairwise" is changed.



--
To view, visit http://gerrit.cloudera.org:8080/21174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe269e08a955097ad9e924d5d64b42438ad15be2
Gerrit-Change-Number: 21174
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 21 Mar 2024 15:47:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-4545: Simplify test dimension in test decimal casting.py

2024-03-20 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21174 )

Change subject: IMPALA-4545: Simplify test dimension in test_decimal_casting.py
..


Patch Set 1: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21174/1/tests/query_test/test_decimal_casting.py
File tests/query_test/test_decimal_casting.py:

http://gerrit.cloudera.org:8080/#/c/21174/1/tests/query_test/test_decimal_casting.py@54
PS1, Line 54: Core only deals with precision 6, 16, 26
Not familiar with the "pairwise" mode, is it the same as the core now?


http://gerrit.cloudera.org:8080/#/c/21174/1/tests/query_test/test_decimal_casting.py@109
PS1, Line 109: # precision, scale = vector.get_value('decimal_type')
This line can be removed?



--
To view, visit http://gerrit.cloudera.org:8080/21174
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe269e08a955097ad9e924d5d64b42438ad15be2
Gerrit-Change-Number: 21174
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 21 Mar 2024 01:47:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-19 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 7: Code-Review+2

Carry Michael's +1.


--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 7
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 19 Mar 2024 17:16:09 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12883: Support updating the charge on an entry in the cache

2024-03-14 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21122 )

Change subject: IMPALA-12883: Support updating the charge on an entry in the 
cache
..


Patch Set 5: Code-Review+1

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc
File be/src/util/cache/lirs-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc@274
PS5, Line 274:
nit. unnecessary new line


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc@424
PS5, Line 424:
nit. unnecessary new line


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache-test.cc@455
PS5, Line 455:
nit. unnecessary new line


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache.cc
File be/src/util/cache/lirs-cache.cc:

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/lirs-cache.cc@990
PS5, Line 990:   CHECK(false);
Maybe add a logging for unexpected state?
CHECK(false) << "Unexpected state: " << e->state();


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/rl-cache.cc
File be/src/util/cache/rl-cache.cc:

http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/rl-cache.cc@324
PS5, Line 324: while (usage_ > capacity_ && rl_.next != _) {
 :   RLHandle* old = rl_.next;
 :   RL_Remove(old);
 :   table_.Remove(old->key(), old->hash());
 :   if (Unref(old)) {
 : old->next = to_remove_head;
 : to_remove_head = old;
 :   }
 : }
This section of code is identical to a part of RLCacheShard::Insert. 
Could we consider extracting this segment into a separate function for it to 
reduce redundancy?


http://gerrit.cloudera.org:8080/#/c/21122/5/be/src/util/cache/rl-cache.cc@336
PS5, Line 336:   while (to_remove_head != nullptr) {
 : RLHandle* next = to_remove_head->next;
 : FreeEntry(to_remove_head);
 : to_remove_head = next;
 :   }
Same as above



--
To view, visit http://gerrit.cloudera.org:8080/21122
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I34f54fb3a91a77821651c25d8d3bc3a2a3945025
Gerrit-Change-Number: 21122
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 14 Mar 2024 18:03:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-03-12 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..


Patch Set 5: Code-Review+1

(3 comments)

Looks good for an initial patch. Could be +2 if no other major issues

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@530
PS3, Line 530:  have not been calcula
> Refactored this to be a ThriftSerializationCtx that is passed in. This code
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1237
PS3, Line 1237:
> Reworked this comment to describe how this is a bottom-up tree traversal.
Done


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java
File fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java@59
PS3, Line 59:
> Changed this to use StringBuilder. I also added an explicit finalize() call
Done



--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 5
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Mar 2024 17:14:22 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-02-27 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc@28
PS3, Line 28: allow_tuple_caching
> The way I see it working is:
Agreed. In this case, I think including a DCHECK(FLAGS_allow_tuple_caching) 
within TupleCacheNode::TupleCacheNode or TupleCacheNode::Open would be a 
beneficial assertion.



--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 27 Feb 2024 23:43:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-02-27 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc@28
PS3, Line 28: allow_tuple_caching
> Used in query-options.cc.
Does that mean the allow_tuple_caching is always true when it comes to the 
class TupleCacheNode so that we don't need to care about this flag in 
TupleCacheNode? If that is the case, maybe we can add a DCHECK in the class


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java
File fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java@59
PS3, Line 59: // TODO: incorporate column information into the key
: // verifyDifferentCacheKeys("select id from 
functional.alltypes",
: // "sele
> I'm less concerned about adding commented test cases that are part of a TOD
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 27 Feb 2024 21:00:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12818: Intermediate Result Caching plan node framework

2024-02-25 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21035 )

Change subject: IMPALA-12818: Intermediate Result Caching plan node framework
..


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc
File be/src/exec/tuple-cache-node.cc:

http://gerrit.cloudera.org:8080/#/c/21035/3/be/src/exec/tuple-cache-node.cc@28
PS3, Line 28: allow_tuple_caching
It seems the flag is not used in the class, or do I miss something?


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java
File fe/src/main/java/org/apache/impala/planner/PlanNode.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@530
PS3, Line 530: tupleCacheInfo == null
Could you please add a comment to describe when or why the tupleCacheInfo is 
null


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1237
PS3, Line 1237: recurse to children
For clarity, maybe it is better to have a full sentence like, "Recursively 
compute tuple cache info for children nodes"


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/PlanNode.java@1251
PS3, Line 1251: TODO
Would be good if we have a Jira to follow the TODOs


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java
File fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java@59
PS3, Line 59: String
May consider StringBuilder for hashTrace_ due to string concatenations in 
mergeChild(), StringBuilder could be more efficient


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCacheInfo.java@94
PS3, Line 94: TODO
Would be good if we have a Jira to follow the TODOs


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCachePlanner.java
File fe/src/main/java/org/apache/impala/planner/TupleCachePlanner.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/main/java/org/apache/impala/planner/TupleCachePlanner.java@75
PS3, Line 75: TODO
Would be good if we have a Jira to follow the TODOs


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java
File fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java:

http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java@59
PS3, Line 59: // TODO: incorporate column information into the key
: // verifyDifferentCacheKeys("select id from 
functional.alltypes",
: // "sele
May simply remove the case from the testcase for the commit if we don't support 
it yet


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java@69
PS3, Line 69: TODO
same as above


http://gerrit.cloudera.org:8080/#/c/21035/3/fe/src/test/java/org/apache/impala/planner/TupleCacheTest.java@79
PS3, Line 79: // TODO: Random functions should make a location ineligible
same as above



--
To view, visit http://gerrit.cloudera.org:8080/21035
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia1f36a87dcce6efd5d1e1f0bc04009bf009b1961
Gerrit-Change-Number: 21035
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Kurt Deschler 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 26 Feb 2024 02:35:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12828: Remove Usage of Unnecessary this->

2024-02-21 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21047 )

Change subject: IMPALA-12828: Remove Usage of Unnecessary this->
..


Patch Set 3: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21047/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21047/3//COMMIT_MSG@9
PS3, Line 9: 408c606
nit. include the related Jira ticket IMPALA-12426 in the commit message might 
be beneficial.



--
To view, visit http://gerrit.cloudera.org:8080/21047
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia6b1d8ba7e27d20ba1ac83caff5e56fd0f9347c0
Gerrit-Change-Number: 21047
Gerrit-PatchSet: 3
Gerrit-Owner: Jason Fehr 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Jason Fehr 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 21 Feb 2024 22:25:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12790: Fix overestimation in ScanNode.getInputCardinality

2024-02-13 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21005 )

Change subject: IMPALA-12790: Fix overestimation in ScanNode.getInputCardinality
..


Patch Set 9: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21005/9/tests/query_test/test_codegen.py
File tests/query_test/test_codegen.py:

http://gerrit.cloudera.org:8080/#/c/21005/9/tests/query_test/test_codegen.py@51
PS9, Line 51: {'disable_codegen_rows_threshold': 7000}
> disable_codegen_rows_threshold default value is 5.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/21005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icc5b39a7684fb8748185349d0b80baf8dcd6b126
Gerrit-Change-Number: 21005
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 14 Feb 2024 00:58:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12790: Fix overestimation in ScanNode.getInputCardinality

2024-02-11 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21005 )

Change subject: IMPALA-12790: Fix overestimation in ScanNode.getInputCardinality
..


Patch Set 6: Code-Review+1

(2 comments)

It looks good to me, just minor things.

http://gerrit.cloudera.org:8080/#/c/21005/6/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/21005/6/fe/src/main/java/org/apache/impala/planner/ScanNode.java@336
PS6, Line 336: if (hasSimpleLimit()) {
It would be good to add comments describing the high-level logic of the function


http://gerrit.cloudera.org:8080/#/c/21005/6/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java
File fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java:

http://gerrit.cloudera.org:8080/#/c/21005/6/fe/src/main/java/org/apache/impala/util/MaxRowsProcessedVisitor.java@54
PS6, Line 54: hasSimpleLimit
Would it be simpler to use scan.hasSimpleLimit() directly since hasSimpleLimit 
is not used anywhere else in the code



--
To view, visit http://gerrit.cloudera.org:8080/21005
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icc5b39a7684fb8748185349d0b80baf8dcd6b126
Gerrit-Change-Number: 21005
Gerrit-PatchSet: 6
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 12 Feb 2024 01:16:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12721: Fix flaky tests involving check deleted file fd()

2024-02-06 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/21000 )

Change subject: IMPALA-12721: Fix flaky tests involving check_deleted_file_fd()
..

IMPALA-12721: Fix flaky tests involving check_deleted_file_fd()

check_deleted_file_fd() is introduced in IMPALA-12681, however some
spilling testcases involving check_deleted_file_fd() seem flaky.

This patch fixed the issue by adding a retry mechanism within the
check_deleted_file_fd() function. If the function encounters a
failure, it retries the process of verifying the presence of a
deleted referencing file. Based on my local test, the file will be
removed after the test even when the test fails and the call to
delete the file handle is ahead of the call to remove the file (This
has been confirmed through additional testing logs). While there is
no theory why this would happen, introducing a retry mechanism has
allowed the test case to run successfully for 200 times without
encountering any failures. It is possible that a delay may be
occurring at some point in the process which leads to this kind of
failure.

Tests:
Reran the testcase 200 times without a failure.

Change-Id: I900aab7dc9833015ce140253ff40da28a6ed3ba6
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 5 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/21000/3
--
To view, visit http://gerrit.cloudera.org:8080/21000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I900aab7dc9833015ce140253ff40da28a6ed3ba6
Gerrit-Change-Number: 21000
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12721: Fix flaky tests involving check deleted file fd()

2024-02-05 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21000


Change subject: IMPALA-12721: Fix flaky tests involving check_deleted_file_fd()
..

IMPALA-12721: Fix flaky tests involving check_deleted_file_fd()

check_deleted_file_fd() is introduced in IMPALA-12681, however some
spilling testcases involving check_deleted_file_fd() seem flaky.

This patch fixed the issue by adding a retry to the testcase if a
failure happens. Based on my local test, the file will be removed
after the test even when the test fails and the call to delete the
file handle is ahead of the call to remove the file (This has been
confirmed through additional testing logs). While there is no theory
why this would happen, introducing a retry mechanism has allowed
the test case to run successfully for 200 times without encountering
any failures. It is possible that a delay may be occurring at some
point in the process which leads to this kind of failure.

Tests:
Reran the testcase 200 times without a failure.

Change-Id: I900aab7dc9833015ce140253ff40da28a6ed3ba6
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 5 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/00/21000/1
--
To view, visit http://gerrit.cloudera.org:8080/21000
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I900aab7dc9833015ce140253ff40da28a6ed3ba6
Gerrit-Change-Number: 21000
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 


[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages

2024-01-29 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20943 )

Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3
Gerrit-Change-Number: 20943
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 29 Jan 2024 18:15:18 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols

2024-01-29 Thread Yida Wu (Code Review)
Yida Wu has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/20944 )

Change subject: IMPALA-12125: Support for dumping symbols from RPMs without 
separate symbols
..

IMPALA-12125: Support for dumping symbols from RPMs without separate symbols

Some RPMs contain binaries with debug symbols with no separate
debuginfo package needed. bin/dump_breakpad_symbols.py does not
allow this combination, as it expects a corresponding symbol
package. This adds a --no_symbol_pkg option to dump_breakpad_symbols.py
to turn off the requirement that --pkg be combined with --symbol_pkg.

Testing:
 - Tested with an RPM package with an unstripped impalad binary
 - Tested with the usual RPM + debuginfo RPM combination

Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b
Reviewed-on: http://gerrit.cloudera.org:8080/20944
Reviewed-by: Impala Public Jenkins 
Tested-by: Yida Wu 
---
M bin/dump_breakpad_symbols.py
1 file changed, 22 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved
  Yida Wu: Verified

--
To view, visit http://gerrit.cloudera.org:8080/20944
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b
Gerrit-Change-Number: 20944
Gerrit-PatchSet: 3
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols

2024-01-29 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20944 )

Change subject: IMPALA-12125: Support for dumping symbols from RPMs without 
separate symbols
..


Patch Set 2: Verified+1

Irrelevant failure at 
TestEventProcessingCustomConfigs.test_skipping_older_events


--
To view, visit http://gerrit.cloudera.org:8080/20944
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b
Gerrit-Change-Number: 20944
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 29 Jan 2024 17:32:47 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols

2024-01-29 Thread Yida Wu (Code Review)
Yida Wu has removed a vote on this change.

Change subject: IMPALA-12125: Support for dumping symbols from RPMs without 
separate symbols
..


Removed Verified-1 by Impala Public Jenkins 
--
To view, visit http://gerrit.cloudera.org:8080/20944
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b
Gerrit-Change-Number: 20944
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages

2024-01-28 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20943 )

Change subject: IMPALA-12745: Skip parallel symbol dumping with RPM/DEB packages
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20943/1/bin/dump_breakpad_symbols.py
File bin/dump_breakpad_symbols.py:

http://gerrit.cloudera.org:8080/#/c/20943/1/bin/dump_breakpad_symbols.py@369
PS1, Line 369: status = 1
Do we need a break here when something happens?



--
To view, visit http://gerrit.cloudera.org:8080/20943
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If2885a9cfb36a4f616b539599e7f744bd23552c3
Gerrit-Change-Number: 20943
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 29 Jan 2024 02:28:15 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12125: Support for dumping symbols from RPMs without separate symbols

2024-01-28 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20944 )

Change subject: IMPALA-12125: Support for dumping symbols from RPMs without 
separate symbols
..


Patch Set 1: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20944
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9589b0ed7855fe49c6989ec3dcc51a9e9c4f476b
Gerrit-Change-Number: 20944
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 29 Jan 2024 02:00:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20898/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20898/2//COMMIT_MSG@10
PS2, Line 10: spillin
> nit: typo in 'spilling'
Thanks. Done.



--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 10:59:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..

IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote spilling where local temporary file
handles were not being released after deletion. However, the tests
associated with this function seem flaky in exhaustive builds with
occasionally some files of hdfs may not be promptly released after
deletion, though locally, I observed that these files are eventually
removed from /proc/xx/fd in a few minutes, the reason is unclear
yet.

To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to detect files containing the keyword
"scratch" only. Given that hdfs files eventually get removed, and
it seems not an urgent issue, a separate Jira will be filed to track
and investigate this behavior further.

Testing:
Reran the tests a couple times and passed.

Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 6 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20898/3
--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..


Patch Set 2:

(1 comment)

Thanks Csaba for the review.

http://gerrit.cloudera.org:8080/#/c/20898/1/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/20898/1/tests/custom_cluster/test_scratch_disk.py@286
PS1, Line 286: # Look for the files with keywords 'scratch' and '(deleted)'.
> Can you mention IMPALA-12698 and that there are temporary files like this c
Done



--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 15 Jan 2024 09:49:34 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-15 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20898 )

Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..

IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote splling where local temporary file
handles were not being released after deletion. However, the tests
associated with this function seem flaky in exhaustive builds with
occasionally some files of hdfs may not be promptly released after
deletion, though locally, I observed that these files are eventually
removed from /proc/xx/fd in a few minutes, the reason is unclear
yet.

To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to detect files containing the keyword
"scratch" only. Given that hdfs files eventually get removed, and
it seems not an urgent issue, a separate Jira will be filed to track
and investigate this behavior further.

Testing:
Reran the tests a couple times and passed.

Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 6 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20898/2
--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12698: Restrict check deleted file fd() for fixing flaky tests

2024-01-14 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20898


Change subject: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky 
tests
..

IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

The introduction of check_deleted_file_fd() in IMPALA-12681 aimed
to detect a bug related to remote splling where local temporary file
handles were not being released after deletion. However, the tests
associated with this function seem flaky in exhaustive builds with
occasionally some files of hdfs may not be promptly released after
deletion, though locally, I observed that these files are eventually
removed from /proc/xx/fd in a few minutes, the reason is unclear
yet.

To fix the flaky build failure, this patch confines the scope of
check_deleted_file_fd() to detect files containing the keyword
"scratch" only. Given that hdfs files eventually get removed, and
it seems not an urgent issue, a separate Jira will be filed to track
and investigate this behavior further.

Testing:
Reran the tests a couple times and passed.

Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
---
M tests/custom_cluster/test_scratch_disk.py
1 file changed, 2 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20898/1
--
To view, visit http://gerrit.cloudera.org:8080/20898
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
Gerrit-Change-Number: 20898
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Wenzhe Zhou 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-08 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..


Patch Set 7:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20852/6/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/20852/6/tests/custom_cluster/test_scratch_disk.py@288
PS6, Line 288:   result = subprocess.check_output(command, shell=True)
 :   return result.strip()
> nit: -2 indentation
Done


http://gerrit.cloudera.org:8080/#/c/20852/6/tests/custom_cluster/test_scratch_disk.py@291
PS6, Line 291:   if not e.output:
 : # If there is no output, return None.
 : return None
 :   return "Error checking the fd path with error 
'{}'".format(e)
> nit: -2 indentation
Done



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 7
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 08 Jan 2024 12:10:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-08 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..

IMPALA-12681: Release file descriptors for partially written temporary files

This patch fixes a bug where partially written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Passed core tests.
Additionally, a check has been added in the test
test_scratch_disk.py to verify that there are no deleted
files in the /proc/x/fd/ directory.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
M be/src/runtime/tmp-file-mgr-test.cc
M tests/custom_cluster/test_scratch_disk.py
6 files changed, 41 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/7
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 7
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-05 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20852/5/tests/custom_cluster/test_scratch_disk.py
File tests/custom_cluster/test_scratch_disk.py:

http://gerrit.cloudera.org:8080/#/c/20852/5/tests/custom_cluster/test_scratch_disk.py@300
PS5, Line 300: for impalad in self.cluster.impalads:
> You can get the list of impalads with
Thanks, this is better. Done.


http://gerrit.cloudera.org:8080/#/c/20852/5/tests/custom_cluster/test_scratch_disk.py@335
PS5, Line 335:   @pytest.mark.execute_serially
> It would make sense to check this on each scratch test, not just on remote
Done



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 05 Jan 2024 22:17:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-05 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..

IMPALA-12681: Release file descriptors for partially written temporary files

This patch fixes a bug where partially written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Passed core tests.
Additionally, a check has been added in the test
test_scratch_disk.py to verify that there are no deleted
files in the /proc/x/fd/ directory.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
M be/src/runtime/tmp-file-mgr-test.cc
M tests/custom_cluster/test_scratch_disk.py
6 files changed, 41 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/6
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-05 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..


Patch Set 5:

The new patch fixes some test cases, and added a check in test_scratch_disk.py 
to automatically check the path. Also manually tested the case to cancel the 
query in the middle during spilling, everything looks fine, I don't see any 
deleted files in /proc/x/fd.


--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 5
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 05 Jan 2024 14:20:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-05 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..

IMPALA-12681: Release file descriptors for partially written temporary files

This patch fixes a bug where partially written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Passed core tests.
Additionally, a check has been added in the test
test_scratch_disk.py to verify that there are no deleted
files in the /proc/x/fd/ directory.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/disk-io-mgr-test.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
M be/src/runtime/tmp-file-mgr-test.cc
M tests/custom_cluster/test_scratch_disk.py
6 files changed, 50 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/5
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 5
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-05 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..


Patch Set 4:

Thanks Csaba and Abhishek for the review. The patch still needs to change one 
or two tests before merge. Still testing, will upload a new patch soon.


--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 05 Jan 2024 11:42:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG@7
PS3, Line 7: IMPALA-12681: Release file descriptors for partially written 
temporary files
> nit: partially
Done


http://gerrit.cloudera.org:8080/#/c/20852/3//COMMIT_MSG@9
PS3, Line 9: This patch fixes a bug where partially written temporary files are
> nit: This patch fixes a bug where partially written temporary files...
Done



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 14:47:58 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partially written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partially written 
temporary files
..

IMPALA-12681: Release file descriptors for partially written temporary files

This patch fixes a bug where partially written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Manual testing of remote spilling, configure a local path as the
local buffer and a S3 remote path for remote storage in scratch dir.
Ran tpcds q78 with a low buffer_pool_limit in the query option to
trigger spilling, and checked the path of local buffer path of the
scratch dir that the temporary files did exist during running the
query.
Without the fix, after running the query, we can find references
of deleted files in /proc/x/fd.
With the patch that there are no references to the deleted
files in the /proc/x/fd/.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/4
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20852/2//COMMIT_MSG@16
PS2, Line 16: Testing:
> can you add a bit more context here or in the Jira on how to trigger this?
The query that can spill (contains something like aggregation or hash join or 
others) should be good enough, in my test, I use tpcds q78, dont need to 
cancel, simply monitor the /proc/x/fd. Added some comments.



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 12:34:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..

IMPALA-12681: Release file descriptors for partial written temporary files

The patch fixes a bug that the partial written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Manual testing of remote spilling, configure a local path as the
local buffer and a S3 remote path for remote storage in scratch dir.
Ran tpcds q78 with a low buffer_pool_limit in the query option to
trigger spilling, and checked the path of local buffer path of the
scratch dir that the temporary files did exist during running the
query.
Without the fix, after running the query, we can find references
of deleted files in /proc/x/fd.
With the patch that there are no references to the deleted
files in the /proc/x/fd/.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/3
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc
File be/src/runtime/io/disk-file.cc:

http://gerrit.cloudera.org:8080/#/c/20852/1/be/src/runtime/io/disk-file.cc@41
PS1, Line 41:   // Close the file writer to release the file handle.
:   RETURN_IF_ERROR(file_writer_->Close());
: }
> Is it safe to delete it here, so can we exclude the possibility of having p
Thanks Csaba! Yeah, it is better to call Close() here and add a DCHECK in the 
destructor of file writer. Changed.



--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 04 Jan 2024 09:06:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-04 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20852 )

Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..

IMPALA-12681: Release file descriptors for partial written temporary files

The patch fixes a bug that the partial written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() of the local file writer
during the Delete() of the DiskFile class, which could be called
when the local buffer file is being evicted or the query ends,
ensuring proper release of the file handle.

Testing:
Manual testing of remote spilling, with the patch that there are
no references to the deleted files in the /proc/x/fd/. Conversely,
in the absence of the patch, such references are observed.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 11 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/2
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-12681: Release file descriptors for partial written temporary files

2024-01-03 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20852


Change subject: IMPALA-12681: Release file descriptors for partial written 
temporary files
..

IMPALA-12681: Release file descriptors for partial written temporary files

The patch fixes a bug that the partial written temporary files are
removed without releasing the file descriptors. This patch fixes
the bug by adding a call to Close() during the destruction of the
LocalFileWriter class, ensuring proper release of the file handle.
Additionally, the patch introduces the immediate release of the
file handle by adding the file writer reset when the Delete()
is called within the DiskFile class.

Testing:
Manual testing of remote spilling, with the patch that there are
no references to the deleted files in the /proc/x/fd/. Conversely,
in the absence of the patch, such references are observed.

Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
---
M be/src/runtime/io/disk-file.cc
M be/src/runtime/io/local-file-writer.cc
M be/src/runtime/io/local-file-writer.h
3 files changed, 13 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/52/20852/1
--
To view, visit http://gerrit.cloudera.org:8080/20852
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I58a2bac419ced806d6f5a32bcdf24d79e078ab14
Gerrit-Change-Number: 20852
Gerrit-PatchSet: 1
Gerrit-Owner: Yida Wu 


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-17 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory which largely exceeds our memory estimates and it is very
hard to predict.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. In our case, each execution engine would have
only one module, and after the compilation of the module, the
compiled codegened functions of the module would be set to the
execution engine, therefore functions could be used by Impala.
During function compilation within the module, if an ObjectCache
is set to the execution engine, the compiled codegened functions
would be also written into the cache. This way, if we keep the
cache, when revisiting the same module (fragment), we can
efficiently reuse the specific ObjectCache, loading pre-compiled
codegened functions and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(1)  | parquet / none / none | 0.22| -0.65% | 0.20   | -0.75% 
|
+--+---+-++++
+--+--+---++-++++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++++---++-+---+
| TPCH(1)  | TPCH-Q13 | parquet / none / none | 0.49   | 0.47|   +2.80% 
  |   5.32%|   5.07%| 10|   +1.22%   | 1.63| 1.19  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.16   | 0.16|   +3.51% 
  |   1.32%| * 10.38% * | 10|   +0.06%   | 0.49| 1.06  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.12   | 0.12|   +1.39% 
  |   2.27%|   2.24%| 10|   +1.50%   | 1.90| 1.37  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.21   | 0.21|   +1.56% 
  | * 10.02% * | * 11.42% * | 10|   +1.18%   | 0.57| 0.32  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 0.27   | 0.27|   +1.71% 
  |   6.46%|   1.29%| 10|   -0.19%   | -1.19   | 0.81  |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.11   | 0.11|   +0.79% 
  |   2.76%|   2.15%| 10|   +0.10%   | 1.46| 0.71  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.26   | 0.26|   +0.71% 
  |   6.63%|   6.18%| 10|   +0.04%   | 0.49| 0.25  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 0.17   | 0.17|   +0.41% 
  | * 14.66% * | * 13.01% * | 10|   +0.05%   | 0.40| 0.07  |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.16   | 0.16|   +0.19% 
  |   1.41%|   1.39%| 10|   +0.25%   | 1.46| 0.31  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.17   | 0.17|   +0.22% 
  |   1.70%|   1.77%| 10|   -0.05%   | -0.40   | 0.28  |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.16   | 0.16|   -0.27% 
  |   0.54%|   1.46%| 10|   +0.14%   | 0.93| -0.54 |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.11   | 0.11|   -0.38% 
  |   0.81%|   2.06%| 10|   +0.03%   | 0.22| -0.54 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.17   | 0.17|   -0.38% 
  |   0.67%|   1.58%| 10|   -0.01%   | -0.13   | -0.70 |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.27   | 0.27|   -0.08% 
  |   1.24%|   1.15%| 10|   -0.33%   | -1.37   | -0.15 |
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.16   | 0.16|   -1.18% 
  | * 16.61% * | * 10.25% * | 10|   +0.33%   | 0.40| -0.19 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.22   | 0.22|   -1.67% 
  |   1.62%|   7.45%| 10|   +0.43%   | 1.02| -0.70 |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.22   | 0.22|   -0.98% 
  |   0.22%|   1.55%| 

[Impala-ASF-CR] Revert "IMPALA-11805: Fix LLVM memory manager bytes allocated"

2023-12-14 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20796 )

Change subject: Revert "IMPALA-11805: Fix LLVM memory manager bytes allocated"
..


Patch Set 1: Code-Review+1

Thanks Michael for the prompt response. Wait to see if there are any additional 
comments. If not, I'll be ready to give it a +2.


--
To view, visit http://gerrit.cloudera.org:8080/20796
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7f7a6b21b3a9fc7c9e675c8d3349725c4863f744
Gerrit-Change-Number: 20796
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 15 Dec 2023 00:13:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-11 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 7:

(1 comment)

Rebased and updated the commit message over the perf A-B result.

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@28
PS3, Line 28: compared to the previous use of ExecutionEngine. Post-change,
> https://jenkins.impala.io/job/perf-AB-test-ub2004/29 for example shows
Thanks Michael. Yeah, that makes sense. Reran the perf A-B test in 
https://jenkins.impala.io/job/perf-AB-test-ub2004/33/. And added below to the 
commit message.
+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(1)  | parquet / none / none | 0.22| -0.65% | 0.20   | -0.75% 
|
+--+---+-++++



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 7
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Dec 2023 06:06:26 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-11 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory which largely exceeds our memory estimates and it is very
hard to predict.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. In our case, each execution engine would have
only one module, after the compilation of the module, the compiled
codegened functions of the module would be set to the execution
engine, therefore funtions could be used by Impala. During function
compilation within the module, if an ObjectCache is set to the
execution engine, the compiled codegened functions would be also
written into the cache. This way, if we keep the cache, when
revisiting the same module (fragment), we can efficiently reuse
the specific ObjectCache, loading pre-compiled codegened functions
and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+---+-++++
| Workload | File Format   | Avg (s) | Delta(Avg) | GeoMean(s) | 
Delta(GeoMean) |
+--+---+-++++
| TPCH(1)  | parquet / none / none | 0.22| -0.65% | 0.20   | -0.75% 
|
+--+---+-++++
+--+--+---++-++++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++++---++-+---+
| TPCH(1)  | TPCH-Q13 | parquet / none / none | 0.49   | 0.47|   +2.80% 
  |   5.32%|   5.07%| 10|   +1.22%   | 1.63| 1.19  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.16   | 0.16|   +3.51% 
  |   1.32%| * 10.38% * | 10|   +0.06%   | 0.49| 1.06  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.12   | 0.12|   +1.39% 
  |   2.27%|   2.24%| 10|   +1.50%   | 1.90| 1.37  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.21   | 0.21|   +1.56% 
  | * 10.02% * | * 11.42% * | 10|   +1.18%   | 0.57| 0.32  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 0.27   | 0.27|   +1.71% 
  |   6.46%|   1.29%| 10|   -0.19%   | -1.19   | 0.81  |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.11   | 0.11|   +0.79% 
  |   2.76%|   2.15%| 10|   +0.10%   | 1.46| 0.71  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.26   | 0.26|   +0.71% 
  |   6.63%|   6.18%| 10|   +0.04%   | 0.49| 0.25  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 0.17   | 0.17|   +0.41% 
  | * 14.66% * | * 13.01% * | 10|   +0.05%   | 0.40| 0.07  |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.16   | 0.16|   +0.19% 
  |   1.41%|   1.39%| 10|   +0.25%   | 1.46| 0.31  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.17   | 0.17|   +0.22% 
  |   1.70%|   1.77%| 10|   -0.05%   | -0.40   | 0.28  |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.16   | 0.16|   -0.27% 
  |   0.54%|   1.46%| 10|   +0.14%   | 0.93| -0.54 |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.11   | 0.11|   -0.38% 
  |   0.81%|   2.06%| 10|   +0.03%   | 0.22| -0.54 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.17   | 0.17|   -0.38% 
  |   0.67%|   1.58%| 10|   -0.01%   | -0.13   | -0.70 |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.27   | 0.27|   -0.08% 
  |   1.24%|   1.15%| 10|   -0.33%   | -1.37   | -0.15 |
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.16   | 0.16|   -1.18% 
  | * 16.61% * | * 10.25% * | 10|   +0.33%   | 0.40| -0.19 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.22   | 0.22|   -1.67% 
  |   1.62%|   7.45%| 10|   +0.43%   | 1.02| -0.70 |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.22   | 0.22|   -0.98% 
  |   0.22%|   1.55%| 10

[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-10 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@28
PS3, Line 28: compared to the previous use of ExecutionEngine. Post-change,
> Doesn't the benchmarking framework print a grand total? If not, we can let
Double checked, but it doesn't seem to have the grand total of our benchmarking 
framework. Filed IMPALA-12615 for it.



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 11 Dec 2023 03:18:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-07 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20733/5/be/src/codegen/llvm-codegen-cache-test.cc
File be/src/codegen/llvm-codegen-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20733/5/be/src/codegen/llvm-codegen-cache-test.cc@39
PS5, Line 39: const int ENGINE_CACHE_SIZE = 990;
> This is larger than the value that was here before. Was the test failing un
Yeah, this is the real number of the cache containing one compiled function. 
The earlier count was misleading as the cache was empty. I've adjusted the 
testcase, and added LlvmCodeGenCacheTest::CheckObjCacheExists to confirm the 
existence of contents within the cache.


http://gerrit.cloudera.org:8080/#/c/20733/5/be/src/codegen/llvm-codegen-cache-test.cc@176
PS5, Line 176: // The function is to create and return a CodeGenObjectCache 
which contains a specific
> nit: "contains" not "contians"
Done



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 6
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 08 Dec 2023 00:28:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-07 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory which largely exceeds our memory estimates and it is very
hard to predict.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. In our case, each execution engine would have
only one module, after the compilation of the module, the compiled
codegened functions of the module would be set to the execution
engine, therefore funtions could be used by Impala. During function
compilation within the module, if an ObjectCache is set to the
execution engine, the compiled codegened functions would be also
written into the cache. This way, if we keep the cache, when
revisiting the same module (fragment), we can efficiently reuse
the specific ObjectCache, loading pre-compiled codegened functions
and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+--+---++-++---++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++---++---++-+---+
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.48   | 0.46|   +5.16% 
  |   4.31%   |   4.78%| 8 |   +3.51%   | 2.11| 2.22  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 1.28   | 1.25|   +1.76% 
  |   1.83%   |   1.88%| 8 |   +3.53%   | 1.34| 1.88  |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.77   | 0.75|   +3.11% 
  |   3.88%   |   2.75%| 8 |   +0.79%   | 0.83| 1.81  |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.72   | 0.71|   +1.89% 
  |   3.48%   |   3.15%| 8 |   +1.61%   | 0.83| 1.13  |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.43   | 0.41|   +3.40% 
  |   1.21%   |   6.11%| 8 |   +0.05%   | 0.45| 1.54  |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73|   +2.04% 
  |   0.75%   |   2.57%| 8 |   +1.13%   | 2.11| 2.16  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.55   | 0.54|   +1.42% 
  |   2.95%   |   1.14%| 8 |   +0.19%   | 0.57| 1.25  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.25   | 0.25|   +0.37% 
  |   2.41%   |   3.09%| 8 |   +0.82%   | 0.45| 0.27  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.62   | 0.61|   +0.92% 
  |   3.17%   |   1.53%| 8 |   +0.05%   | 0.19| 0.73  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.42   | 0.42|   +0.37% 
  |   1.24%   |   1.08%| 8 |   +0.03%   | 0.19| 0.63  |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.72   | 0.72|   +0.19% 
  |   3.08%   |   3.23%| 8 |   +0.07%   | 0.57| 0.12  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 1.20   | 1.19|   +0.04% 
  |   0.12%   |   0.15%| 8 |   +0.03%   | 0.57| 0.53  |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 1.13   | 1.13|   +0.02% 
  |   2.05%   |   2.04%| 8 |   -0.01%   | -0.57   | 0.02  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.60   | 0.60|   +0.03% 
  |   3.71%   |   3.99%| 8 |   -0.04%   | -0.06   | 0.01  |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.44   | 0.44|   -0.70% 
  |   8.56%   |   7.42%| 8 |   +0.61%   | 0.45| -0.18 |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.31   | 0.31|   -0.07% 
  |   1.73%   |   1.56%| 8 |   -0.06%   | -1.34   | -0.09 |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.48   | 0.48|   -0.25% 
  |   1.20%   |   1.47%| 8 |   -0.09%   | -0.57   | -0.37 |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.54   | 0.54|   -0.86% 
  |   1.90%   |   4.48%| 8 |   +0.11%   | 0.32| -0.50 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 1.35   | 1.37|   -1.56% 
  |   3.11%   |   2.55%| 8 |   -0.19%   | -0.70   | -1.11 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.26   | 0.27|   -1.74% 
  |   9.53%   | * 13.61% * | 8 

[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-07 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20733/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20733/4//COMMIT_MSG@17
PS4, Line 17:  after the compila
> Could you clarify the relationship of the module and exec engine here? I.e.
Yes. Added some comments.



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 5
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 07 Dec 2023 13:38:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-07 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 5:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@28
PS3, Line 28: compared to the previous use of ExecutionEngine. Post-change,
> Doesn't the benchmarking framework print a grand total? If not, we can let
It seems our benchmarking framework doesn't print the total. Will double check 
and file a jira if we don't have it.


http://gerrit.cloudera.org:8080/#/c/20733/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20733/4//COMMIT_MSG@18
PS4, Line 18:  the executi
> Why are they pre-compiled? Aren't they compiled already? Or you mean they a
I think the "pre-compiled" means it is compiled in the last round and ready for 
use. Yeah, maybe call it "compiled" is better, because in this case it means 
the first round to write the compiled stuff into the cache.


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-cache.h@164
PS4, Line 164: cached_engine
> Thinking about this, wouldn't 'cached_engine...' express the meaning better
Done


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-cache.cc
File be/src/codegen/llvm-codegen-cache.cc:

http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-cache.cc@155
PS4, Line 155: y
> I think removing the negation and inverting the branches is better, one les
Done


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-object-cache.cc
File be/src/codegen/llvm-codegen-object-cache.cc:

http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-object-cache.cc@39
PS4, Line 39: uffer().size();
> Is a copy necessary here? Can't we move the buffer somehow?
CodeGenObjectCache doesn't have the ownership of ObjBuffer, the llvm engine may 
use the ObjBuffer later, so it is safer to make a copy. (And MemoryBufferRef 
may not allow to move the real Buffer, the example I can see is to do a copy)


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-object-cache.cc@42
PS4, Line 42: // can conserve memory.
> Shouldn't we add a DCHECK here? If this happens it is a bug.
Done


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-object-cache.cc@57
PS4, Line 57:   // ID. If the ID doesn't match the one in the object cache, a 
warning is logged for
> Will it always copy the buffer? Is there a way to access it without copying
>From the definition of the interface getObject(), "Returns a pointer to a 
>newly allocated MemoryBuffer that contains the object which corresponds with 
>Module M", it would be safer to return the copy of the buffer. 
>https://llvm.org/doxygen/classllvm_1_1ObjectCache.html#details


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-object-cache.cc@60
PS4, Line 60:   std::lock_guard l(mutex_);
> Shouldn't we add a DCHECK here? If this happens it is a bug.
Done


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen-object-cache.cc@63
PS4, Line 63:   << cached_obj_->getMemBufferRef().getBufferSize();
> Can this function be called when 'key_' is empty? Isn't that a bug?
If there is no cache yet, would return nullptr, for example, the llvm engine 
would look up the cache to see if it contains the pre-compiled module, if it 
doesn't contain any, the llvm engine would proceed to compile the module and 
then write compiled module to the cache. This happens in cache look up and 
doesn't hit any.


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen.h@330
PS4, Line 330:
> See comment on llvm-codegen-cache.h:164.
Changed the one in llvm-codegen-cache.h, but would prefer to keep this, because 
this one is the cache for llvm engine to write the compiled functions to, it 
doesn't have to be anything "cached". Added some comments.


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen.h@925
PS4, Line 925:
> Do I understand it correctly that this is for writing to the cache (storing
Yes, we can say that the engine_cache_ is for writing, and engine_cache_cached_ 
is for reading. Added some comments.


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen.h@929
PS4, Line 929:
> We could extend the comment with what is written on llvm-codegen.cc:1499, i
Added some comments.


http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20733/4/be/src/codegen/llvm-codegen.cc@1192
PS4, 

[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-07 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory which largely exceeds our memory estimates and it is very
hard to predict.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. In our case, each execution engine would have
only one module, after the compilation of the module, the compiled
codegened functions of the module would be set to the execution
engine, therefore funtions could be used by Impala. During function
compilation within the module, if an ObjectCache is set to the
execution engine, the compiled codegened functions would be also
written into the cache. This way, if we keep the cache, when
revisiting the same module (fragment), we can efficiently reuse
the specific ObjectCache, loading pre-compiled codegened functions
and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+--+---++-++---++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++---++---++-+---+
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.48   | 0.46|   +5.16% 
  |   4.31%   |   4.78%| 8 |   +3.51%   | 2.11| 2.22  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 1.28   | 1.25|   +1.76% 
  |   1.83%   |   1.88%| 8 |   +3.53%   | 1.34| 1.88  |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.77   | 0.75|   +3.11% 
  |   3.88%   |   2.75%| 8 |   +0.79%   | 0.83| 1.81  |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.72   | 0.71|   +1.89% 
  |   3.48%   |   3.15%| 8 |   +1.61%   | 0.83| 1.13  |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.43   | 0.41|   +3.40% 
  |   1.21%   |   6.11%| 8 |   +0.05%   | 0.45| 1.54  |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73|   +2.04% 
  |   0.75%   |   2.57%| 8 |   +1.13%   | 2.11| 2.16  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.55   | 0.54|   +1.42% 
  |   2.95%   |   1.14%| 8 |   +0.19%   | 0.57| 1.25  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.25   | 0.25|   +0.37% 
  |   2.41%   |   3.09%| 8 |   +0.82%   | 0.45| 0.27  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.62   | 0.61|   +0.92% 
  |   3.17%   |   1.53%| 8 |   +0.05%   | 0.19| 0.73  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.42   | 0.42|   +0.37% 
  |   1.24%   |   1.08%| 8 |   +0.03%   | 0.19| 0.63  |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.72   | 0.72|   +0.19% 
  |   3.08%   |   3.23%| 8 |   +0.07%   | 0.57| 0.12  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 1.20   | 1.19|   +0.04% 
  |   0.12%   |   0.15%| 8 |   +0.03%   | 0.57| 0.53  |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 1.13   | 1.13|   +0.02% 
  |   2.05%   |   2.04%| 8 |   -0.01%   | -0.57   | 0.02  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.60   | 0.60|   +0.03% 
  |   3.71%   |   3.99%| 8 |   -0.04%   | -0.06   | 0.01  |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.44   | 0.44|   -0.70% 
  |   8.56%   |   7.42%| 8 |   +0.61%   | 0.45| -0.18 |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.31   | 0.31|   -0.07% 
  |   1.73%   |   1.56%| 8 |   -0.06%   | -1.34   | -0.09 |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.48   | 0.48|   -0.25% 
  |   1.20%   |   1.47%| 8 |   -0.09%   | -0.57   | -0.37 |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.54   | 0.54|   -0.86% 
  |   1.90%   |   4.48%| 8 |   +0.11%   | 0.32| -0.50 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 1.35   | 1.37|   -1.56% 
  |   3.11%   |   2.55%| 8 |   -0.19%   | -0.70   | -1.11 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.26   | 0.27|   -1.74% 
  |   9.53%   | * 13.61% * | 8 

[Impala-ASF-CR] IMPALA-11805: Fix LLVM memory manager bytes allocated

2023-12-03 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20697 )

Change subject: IMPALA-11805: Fix LLVM memory manager bytes allocated
..


Patch Set 13: Code-Review+1

LGTM!!
Considering the rise in bytes_allocated after the change, have you run 
exhaustive tests to ensure the memory tracker can handle the increased demand?


--
To view, visit http://gerrit.cloudera.org:8080/20697
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5a0193a1694db0718d75651dc2b2f9f92c0d6064
Gerrit-Change-Number: 20697
Gerrit-PatchSet: 13
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 04 Dec 2023 02:01:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-02 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 4:

(12 comments)

Thanks Daniel for reviewing.

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@12
PS3, Line 12: which largely exceeds our memo
> I think this should come before "which largely exceeds...", because "which
Done


http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@17
PS3, Line 17:
> Nit: space before '('.
Done


http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@18
PS3, Line 18: pre-compiled
: codegened
> Do you mean the codegened functions? First I thought it referred to the fun
Yeah, changed it to pre-compiled codegened functions.


http://gerrit.cloudera.org:8080/#/c/20733/3//COMMIT_MSG@28
PS3, Line 28:
> Could you also include the aggregate result over all queries?
Do you mean run the time running all the tpch queries for once? Did a test, run 
only once for all the TPCH queries, the result of TPCH-Q1 for Execution Engine 
looks suspicious (maybe because it is the first of all to run), but all others 
look similar.
Query   Execution EngineObject Cache
Q1  4   1.68
Q10 1.521.67
Q11 0.8 0.8
Q12 0.680.58
Q13 1.091.29
Q14 0.820.82
Q15 0.890.88
Q16 0.8 0.75
Q17 1.091.09
Q18 1.151.25
Q19 2.072.11
Q2  1.361.36
Q20 1.051.06
Q21 1.911.91
Q22 0.630.63
Q3  0.940.93
Q4  0.480.47
Q5  1.451.45
Q6  0.260.26
Q7  2.2 2.15
Q8  1.811.81
Q9  1.851.85
Total   28.85   26.8


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc
File be/src/codegen/llvm-codegen-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc@99
PS3, Line 99: dule_id, bool
> We could include the parameter names here and on the next line if 'codegen'
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc@115
PS3, Line 115:
> Maybe 'cached_engine' would be better, also on L117.
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc@132
PS3, Line 132:
> I'd prefer something like 'codegen_obj_cache_', it's easier to get what it
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc@138
PS3, Line 138:
> Could it be const?
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc@142
PS3, Line 142:
> Could use 'nullptr' instead.
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache-test.cc@225
PS3, Line 225: te a Llvm
> 'module_id_echo' would be better, cf. 'module_id_double'.
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache.h@231
PS3, Line 231: engine_cache_raw
> Maybe 'engine_cache_raw[_ptr]' would be better.
Done


http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache.cc
File be/src/codegen/llvm-codegen-cache.cc:

http://gerrit.cloudera.org:8080/#/c/20733/3/be/src/codegen/llvm-codegen-cache.cc@154
PS3, Line 154:   Slice key;
> The assignment could be made in an IF-ELSE (or a ? : operator). Now we get/
Done



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 4
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Sat, 02 Dec 2023 13:15:54 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-12-02 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory which largely exceeds our memory estimates and it is very
hard to predict.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. The approach involves associating an ObjectCache
with the module (execution engine). During function compilation
within the module, if an ObjectCache is set, the pre-compiled
codegened functions are written into the cache. This way, when
revisiting the same module (fragment), we can efficiently reuse
the specific ObjectCache, loading pre-compiled codegened functions
and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+--+---++-++---++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++---++---++-+---+
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.48   | 0.46|   +5.16% 
  |   4.31%   |   4.78%| 8 |   +3.51%   | 2.11| 2.22  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 1.28   | 1.25|   +1.76% 
  |   1.83%   |   1.88%| 8 |   +3.53%   | 1.34| 1.88  |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.77   | 0.75|   +3.11% 
  |   3.88%   |   2.75%| 8 |   +0.79%   | 0.83| 1.81  |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.72   | 0.71|   +1.89% 
  |   3.48%   |   3.15%| 8 |   +1.61%   | 0.83| 1.13  |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.43   | 0.41|   +3.40% 
  |   1.21%   |   6.11%| 8 |   +0.05%   | 0.45| 1.54  |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73|   +2.04% 
  |   0.75%   |   2.57%| 8 |   +1.13%   | 2.11| 2.16  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.55   | 0.54|   +1.42% 
  |   2.95%   |   1.14%| 8 |   +0.19%   | 0.57| 1.25  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.25   | 0.25|   +0.37% 
  |   2.41%   |   3.09%| 8 |   +0.82%   | 0.45| 0.27  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.62   | 0.61|   +0.92% 
  |   3.17%   |   1.53%| 8 |   +0.05%   | 0.19| 0.73  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.42   | 0.42|   +0.37% 
  |   1.24%   |   1.08%| 8 |   +0.03%   | 0.19| 0.63  |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.72   | 0.72|   +0.19% 
  |   3.08%   |   3.23%| 8 |   +0.07%   | 0.57| 0.12  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 1.20   | 1.19|   +0.04% 
  |   0.12%   |   0.15%| 8 |   +0.03%   | 0.57| 0.53  |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 1.13   | 1.13|   +0.02% 
  |   2.05%   |   2.04%| 8 |   -0.01%   | -0.57   | 0.02  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.60   | 0.60|   +0.03% 
  |   3.71%   |   3.99%| 8 |   -0.04%   | -0.06   | 0.01  |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.44   | 0.44|   -0.70% 
  |   8.56%   |   7.42%| 8 |   +0.61%   | 0.45| -0.18 |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.31   | 0.31|   -0.07% 
  |   1.73%   |   1.56%| 8 |   -0.06%   | -1.34   | -0.09 |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.48   | 0.48|   -0.25% 
  |   1.20%   |   1.47%| 8 |   -0.09%   | -0.57   | -0.37 |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.54   | 0.54|   -0.86% 
  |   1.90%   |   4.48%| 8 |   +0.11%   | 0.32| -0.50 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 1.35   | 1.37|   -1.56% 
  |   3.11%   |   2.55%| 8 |   -0.19%   | -0.70   | -1.11 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.26   | 0.27|   -1.74% 
  |   9.53%   | * 13.61% * | 8 |   -0.32%   | -0.06   | -0.30 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.64   | 0.67|   -4.64% 
  |   1.13%   | * 12.59% * | 8 |   -0.28%   | -0.19   | -1.04 |
| 

[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-11-30 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen-cache-test.cc
File be/src/codegen/llvm-codegen-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen-cache-test.cc@298
PS2, Line 298:   // 172B for optimal mode, or 180B on ARM systems
> Oh, I meant the largest difference I saw between normal and optimal mode is
Did a test in both arm and x86. In our testcase of mem_charge_1, the mem charge 
in arm is always 8B larger than x86 in normal and optimal mode.
For mem_charge_1, arm 188B/180B, x86 180B/172B. The current size setting should 
be good for testing, and I can pass the test in arm.


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc@1385
PS2, Line 1385:
> So bytes_allocated is still needed, which means we still want https://gerri
>From the source code and gdb debugging that I can see, the bytes_allocated 
>will be used when loading from the pre-compiled object. 
>https://llvm.org/doxygen/MCJIT_8cpp_source.html#l00224. While the manage usage 
>of pre-compiled object is currently managed by the codegen cache if codegen 
>cache is enabled, the bytes_allocated is managed by the runtime execution 
>engine, I think bytes_allocated is still useful for runtime use no matter 
>codegen cache is on or off.



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 30 Nov 2023 10:00:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-11-29 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..


Patch Set 3:

(12 comments)

Thanks Michael for the review.

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen-cache-test.cc
File be/src/codegen/llvm-codegen-cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen-cache-test.cc@298
PS2, Line 298:   // 172B for optimal mode, or 180B on ARM systems
> I experimented with this. In practice I saw minimal (<= 8B) difference betw
Adjusted the value after adding sizeof(CodeGenObjectCache).


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen-cache.h@308
PS2, Line 308:   std::unordered_map>
> This would also work as a std::unordered_set. https://en.cppreference.com/w
When creating a temporary shared_ptr from the raw pointer, there might be a 
double-free problem. I'm uncertain if there's a more optimal solution. However, 
considering the cache entry is typically over 1k bytes while the 8-byte pointer 
is relatively small, it doesn't seem like a significant concern in this context.


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.h
File be/src/codegen/llvm-codegen.h:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.h@33
PS2, Line 33: #include 
> What's this used for?
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.h@353
PS2, Line 353:   /// retrieval. If module_id is not nullptr, the final module 
id is returned.
> Can you expand what module_id is used for. Would help to identify when to p
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.h@920
PS2, Line 920:   llvm::Module* module_;
> engine_cache_wrapper no longer owns an execution engine, so this comment is
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc@a1185
PS2, Line 1185:
> I thought this comment was kind of useful. And I see you still use it at li
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc@496
PS2, Line 496:   engine_cache_ = make_shared();
> Why initialize it this way? You could also just pass "new CodegenObjectCach
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc@498
PS2, Line 498:   RETURN_IF_ERROR(LoadIntrinsics());
> You could directly assign to execution_engine_ on line 479. Lots of extra m
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc@499
PS2, Line 499:
> Why not assign directly to symbol_emitter_?
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-codegen.cc@1385
PS2, Line 1385:
> Is this still right? Shouldn't it be the object cache size, since that's wh
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-object-cache-wrapper.h
File be/src/codegen/llvm-object-cache-wrapper.h:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-object-cache-wrapper.h@44
PS2, Line 44:
> possibly other objects seems weird as there aren't other objects. Do we rea
Done


http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-object-cache-wrapper.cc
File be/src/codegen/llvm-object-cache-wrapper.cc:

http://gerrit.cloudera.org:8080/#/c/20733/2/be/src/codegen/llvm-object-cache-wrapper.cc@66
PS2, Line 66:
> Is sizeof(CodeGenObjectCache) accounted for elsewhere?
Yeah, this is a good point. Added here.



--
To view, visit http://gerrit.cloudera.org:8080/20733
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic3c1b46bb9018ed0320817141785a3bdc41fa677
Gerrit-Change-Number: 20733
Gerrit-PatchSet: 3
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Wed, 29 Nov 2023 14:37:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-11-29 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory and it is very hard to predict which largely exceeds our
memory estimates.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. The approach involves associating an ObjectCache
with the module(execution engine). During function compilation
within the module, if an ObjectCache is set, the pre-compiled
functions are written into the cache. This way, when revisiting
the same module (fragment), we can efficiently reuse the specific
ObjectCache, loading pre-compiled functions and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+--+---++-++---++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++---++---++-+---+
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.48   | 0.46|   +5.16% 
  |   4.31%   |   4.78%| 8 |   +3.51%   | 2.11| 2.22  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 1.28   | 1.25|   +1.76% 
  |   1.83%   |   1.88%| 8 |   +3.53%   | 1.34| 1.88  |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.77   | 0.75|   +3.11% 
  |   3.88%   |   2.75%| 8 |   +0.79%   | 0.83| 1.81  |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.72   | 0.71|   +1.89% 
  |   3.48%   |   3.15%| 8 |   +1.61%   | 0.83| 1.13  |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.43   | 0.41|   +3.40% 
  |   1.21%   |   6.11%| 8 |   +0.05%   | 0.45| 1.54  |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73|   +2.04% 
  |   0.75%   |   2.57%| 8 |   +1.13%   | 2.11| 2.16  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.55   | 0.54|   +1.42% 
  |   2.95%   |   1.14%| 8 |   +0.19%   | 0.57| 1.25  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.25   | 0.25|   +0.37% 
  |   2.41%   |   3.09%| 8 |   +0.82%   | 0.45| 0.27  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.62   | 0.61|   +0.92% 
  |   3.17%   |   1.53%| 8 |   +0.05%   | 0.19| 0.73  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.42   | 0.42|   +0.37% 
  |   1.24%   |   1.08%| 8 |   +0.03%   | 0.19| 0.63  |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.72   | 0.72|   +0.19% 
  |   3.08%   |   3.23%| 8 |   +0.07%   | 0.57| 0.12  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 1.20   | 1.19|   +0.04% 
  |   0.12%   |   0.15%| 8 |   +0.03%   | 0.57| 0.53  |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 1.13   | 1.13|   +0.02% 
  |   2.05%   |   2.04%| 8 |   -0.01%   | -0.57   | 0.02  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.60   | 0.60|   +0.03% 
  |   3.71%   |   3.99%| 8 |   -0.04%   | -0.06   | 0.01  |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.44   | 0.44|   -0.70% 
  |   8.56%   |   7.42%| 8 |   +0.61%   | 0.45| -0.18 |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.31   | 0.31|   -0.07% 
  |   1.73%   |   1.56%| 8 |   -0.06%   | -1.34   | -0.09 |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.48   | 0.48|   -0.25% 
  |   1.20%   |   1.47%| 8 |   -0.09%   | -0.57   | -0.37 |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.54   | 0.54|   -0.86% 
  |   1.90%   |   4.48%| 8 |   +0.11%   | 0.32| -0.50 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 1.35   | 1.37|   -1.56% 
  |   3.11%   |   2.55%| 8 |   -0.19%   | -0.70   | -1.11 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.26   | 0.27|   -1.74% 
  |   9.53%   | * 13.61% * | 8 |   -0.32%   | -0.06   | -0.30 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.64   | 0.67|   -4.64% 
  |   1.13%   | * 12.59% * | 8 |   -0.28%   | -0.19   | -1.04 |
| TPCH(1)  | TPCH-Q13 | 

[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-11-27 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20733 )

Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory and it is very hard to predict which largely exceeds our
memory estimates.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. The approach involves associating an ObjectCache
with the module(execution engine). During function compilation
within the module, if an ObjectCache is set, the pre-compiled
functions are written into the cache. This way, when revisiting
the same module (fragment), we can efficiently reuse the specific
ObjectCache, loading pre-compiled functions and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+--+---++-++---++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++---++---++-+---+
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.48   | 0.46|   +5.16% 
  |   4.31%   |   4.78%| 8 |   +3.51%   | 2.11| 2.22  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 1.28   | 1.25|   +1.76% 
  |   1.83%   |   1.88%| 8 |   +3.53%   | 1.34| 1.88  |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.77   | 0.75|   +3.11% 
  |   3.88%   |   2.75%| 8 |   +0.79%   | 0.83| 1.81  |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.72   | 0.71|   +1.89% 
  |   3.48%   |   3.15%| 8 |   +1.61%   | 0.83| 1.13  |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.43   | 0.41|   +3.40% 
  |   1.21%   |   6.11%| 8 |   +0.05%   | 0.45| 1.54  |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73|   +2.04% 
  |   0.75%   |   2.57%| 8 |   +1.13%   | 2.11| 2.16  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.55   | 0.54|   +1.42% 
  |   2.95%   |   1.14%| 8 |   +0.19%   | 0.57| 1.25  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.25   | 0.25|   +0.37% 
  |   2.41%   |   3.09%| 8 |   +0.82%   | 0.45| 0.27  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.62   | 0.61|   +0.92% 
  |   3.17%   |   1.53%| 8 |   +0.05%   | 0.19| 0.73  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.42   | 0.42|   +0.37% 
  |   1.24%   |   1.08%| 8 |   +0.03%   | 0.19| 0.63  |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.72   | 0.72|   +0.19% 
  |   3.08%   |   3.23%| 8 |   +0.07%   | 0.57| 0.12  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 1.20   | 1.19|   +0.04% 
  |   0.12%   |   0.15%| 8 |   +0.03%   | 0.57| 0.53  |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 1.13   | 1.13|   +0.02% 
  |   2.05%   |   2.04%| 8 |   -0.01%   | -0.57   | 0.02  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.60   | 0.60|   +0.03% 
  |   3.71%   |   3.99%| 8 |   -0.04%   | -0.06   | 0.01  |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.44   | 0.44|   -0.70% 
  |   8.56%   |   7.42%| 8 |   +0.61%   | 0.45| -0.18 |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.31   | 0.31|   -0.07% 
  |   1.73%   |   1.56%| 8 |   -0.06%   | -1.34   | -0.09 |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.48   | 0.48|   -0.25% 
  |   1.20%   |   1.47%| 8 |   -0.09%   | -0.57   | -0.37 |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.54   | 0.54|   -0.86% 
  |   1.90%   |   4.48%| 8 |   +0.11%   | 0.32| -0.50 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 1.35   | 1.37|   -1.56% 
  |   3.11%   |   2.55%| 8 |   -0.19%   | -0.70   | -1.11 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.26   | 0.27|   -1.74% 
  |   9.53%   | * 13.61% * | 8 |   -0.32%   | -0.06   | -0.30 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.64   | 0.67|   -4.64% 
  |   1.13%   | * 12.59% * | 8 |   -0.28%   | -0.19   | -1.04 |
| TPCH(1)  | TPCH-Q13 | 

[Impala-ASF-CR] IMPALA-11805: Use llvm ObjectCache for codegen caching

2023-11-27 Thread Yida Wu (Code Review)
Yida Wu has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/20733


Change subject: IMPALA-11805: Use llvm ObjectCache for codegen caching
..

IMPALA-11805: Use llvm ObjectCache for codegen caching

Currently, we employ llvm::ExecutionEngine for codegen caching,
providing access to compiled functions within the cached engine.
However, the real challenge is the ExecutionEngine uses a lot of
memory and it is very hard to predict which largely exceeds our
memory estimates.

This patch addresses this issue by using llvm::ObjectCache for
codegen caching. The approach involves associating an ObjectCache
with the module(execution engine). During function compilation
within the module, if an ObjectCache is set, the pre-compiled
functions are written into the cache. This way, when revisiting
the same module (fragment), we can efficiently reuse the specific
ObjectCache, loading pre-compiled functions and saving time.

The tpch performance test indicates no significant regression
compared to the previous use of ExecutionEngine. Post-change,
the actual memory usage of each codegen caching entry is notably
reduced.

+--+--+---++-++---++---++-+---+
| Workload | Query| File Format   | Avg(s) | Base Avg(s) | 
Delta(Avg) | StdDev(%) | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | 
Tval  |
+--+--+---++-++---++---++-+---+
| TPCH(1)  | TPCH-Q15 | parquet / none / none | 0.48   | 0.46|   +5.16% 
  |   4.31%   |   4.78%| 8 |   +3.51%   | 2.11| 2.22  |
| TPCH(1)  | TPCH-Q17 | parquet / none / none | 1.28   | 1.25|   +1.76% 
  |   1.83%   |   1.88%| 8 |   +3.53%   | 1.34| 1.88  |
| TPCH(1)  | TPCH-Q8  | parquet / none / none | 0.77   | 0.75|   +3.11% 
  |   3.88%   |   2.75%| 8 |   +0.79%   | 0.83| 1.81  |
| TPCH(1)  | TPCH-Q7  | parquet / none / none | 0.72   | 0.71|   +1.89% 
  |   3.48%   |   3.15%| 8 |   +1.61%   | 0.83| 1.13  |
| TPCH(1)  | TPCH-Q22 | parquet / none / none | 0.43   | 0.41|   +3.40% 
  |   1.21%   |   6.11%| 8 |   +0.05%   | 0.45| 1.54  |
| TPCH(1)  | TPCH-Q10 | parquet / none / none | 0.74   | 0.73|   +2.04% 
  |   0.75%   |   2.57%| 8 |   +1.13%   | 2.11| 2.16  |
| TPCH(1)  | TPCH-Q20 | parquet / none / none | 0.55   | 0.54|   +1.42% 
  |   2.95%   |   1.14%| 8 |   +0.19%   | 0.57| 1.25  |
| TPCH(1)  | TPCH-Q11 | parquet / none / none | 0.25   | 0.25|   +0.37% 
  |   2.41%   |   3.09%| 8 |   +0.82%   | 0.45| 0.27  |
| TPCH(1)  | TPCH-Q19 | parquet / none / none | 0.62   | 0.61|   +0.92% 
  |   3.17%   |   1.53%| 8 |   +0.05%   | 0.19| 0.73  |
| TPCH(1)  | TPCH-Q4  | parquet / none / none | 0.42   | 0.42|   +0.37% 
  |   1.24%   |   1.08%| 8 |   +0.03%   | 0.19| 0.63  |
| TPCH(1)  | TPCH-Q5  | parquet / none / none | 0.72   | 0.72|   +0.19% 
  |   3.08%   |   3.23%| 8 |   +0.07%   | 0.57| 0.12  |
| TPCH(1)  | TPCH-Q18 | parquet / none / none | 1.20   | 1.19|   +0.04% 
  |   0.12%   |   0.15%| 8 |   +0.03%   | 0.57| 0.53  |
| TPCH(1)  | TPCH-Q9  | parquet / none / none | 1.13   | 1.13|   +0.02% 
  |   2.05%   |   2.04%| 8 |   -0.01%   | -0.57   | 0.02  |
| TPCH(1)  | TPCH-Q3  | parquet / none / none | 0.60   | 0.60|   +0.03% 
  |   3.71%   |   3.99%| 8 |   -0.04%   | -0.06   | 0.01  |
| TPCH(1)  | TPCH-Q2  | parquet / none / none | 0.44   | 0.44|   -0.70% 
  |   8.56%   |   7.42%| 8 |   +0.61%   | 0.45| -0.18 |
| TPCH(1)  | TPCH-Q6  | parquet / none / none | 0.31   | 0.31|   -0.07% 
  |   1.73%   |   1.56%| 8 |   -0.06%   | -1.34   | -0.09 |
| TPCH(1)  | TPCH-Q14 | parquet / none / none | 0.48   | 0.48|   -0.25% 
  |   1.20%   |   1.47%| 8 |   -0.09%   | -0.57   | -0.37 |
| TPCH(1)  | TPCH-Q12 | parquet / none / none | 0.54   | 0.54|   -0.86% 
  |   1.90%   |   4.48%| 8 |   +0.11%   | 0.32| -0.50 |
| TPCH(1)  | TPCH-Q21 | parquet / none / none | 1.35   | 1.37|   -1.56% 
  |   3.11%   |   2.55%| 8 |   -0.19%   | -0.70   | -1.11 |
| TPCH(1)  | TPCH-Q16 | parquet / none / none | 0.26   | 0.27|   -1.74% 
  |   9.53%   | * 13.61% * | 8 |   -0.32%   | -0.06   | -0.30 |
| TPCH(1)  | TPCH-Q1  | parquet / none / none | 0.64   | 0.67|   -4.64% 
  |   1.13%   | * 12.59% * | 8 |   -0.28%   | -0.19   | -1.04 |
| TPCH(1)  | TPCH-Q13 

[Impala-ASF-CR] IMPALA-12535: Fix misleading metric keys for the threadz page

2023-11-16 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20658 )

Change subject: IMPALA-12535: Fix misleading metric keys for the threadz page
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/20658
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15a8cf0a318bc7122d1f5df29f18d8e467249ef7
Gerrit-Change-Number: 20658
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Comment-Date: Thu, 16 Nov 2023 23:41:43 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12306: (Part 2) Make codegen cache tests with symbol emitter more robust

2023-11-13 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20318 )

Change subject: IMPALA-12306: (Part 2) Make codegen cache tests with symbol 
emitter more robust
..


Patch Set 6: Code-Review+1

(1 comment)

It makes sense to me.

http://gerrit.cloudera.org:8080/#/c/20318/4/be/src/codegen/codegen-symbol-emitter.cc
File be/src/codegen/codegen-symbol-emitter.cc:

http://gerrit.cloudera.org:8080/#/c/20318/4/be/src/codegen/codegen-symbol-emitter.cc@58
PS4, Line 58: stringstream s;
> I think it's a good idea to check it and doesn't make normal execution slow
Done



--
To view, visit http://gerrit.cloudera.org:8080/20318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61b9b0de9c896f3de7eb1be7de33d822b1ab70d0
Gerrit-Change-Number: 20318
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 13 Nov 2023 19:15:25 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12306: (Part 2) Make codegen cache tests with symbol emitter more robust

2023-11-12 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20318 )

Change subject: IMPALA-12306: (Part 2) Make codegen cache tests with symbol 
emitter more robust
..


Patch Set 4: Code-Review+1

(2 comments)

It looks a great idea to me

http://gerrit.cloudera.org:8080/#/c/20318/4/be/src/codegen/codegen-symbol-emitter.cc
File be/src/codegen/codegen-symbol-emitter.cc:

http://gerrit.cloudera.org:8080/#/c/20318/4/be/src/codegen/codegen-symbol-emitter.cc@58
PS4, Line 58:   if (non_freed_objects_ > 0) {
I am not very sure if we also need to consider the abnormal case 
non_freed_objects_ less than 0 in release version, because DCHECK only handles 
debug version. What do you think?


http://gerrit.cloudera.org:8080/#/c/20318/4/be/src/codegen/codegen-symbol-emitter.cc@64
PS4, Line 64:
nit. seems no need for the new line



--
To view, visit http://gerrit.cloudera.org:8080/20318
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I61b9b0de9c896f3de7eb1be7de33d822b1ab70d0
Gerrit-Change-Number: 20318
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 13 Nov 2023 07:14:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12535: Fix misleading metric keys for the threadz page

2023-11-10 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20658 )

Change subject: IMPALA-12535: Fix misleading metric keys for the threadz page
..


Patch Set 2: Code-Review+1

LGTM!


--
To view, visit http://gerrit.cloudera.org:8080/20658
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15a8cf0a318bc7122d1f5df29f18d8e467249ef7
Gerrit-Change-Number: 20658
Gerrit-PatchSet: 2
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Quanlong Huang 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Sat, 11 Nov 2023 06:41:55 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12535: Fix misleading metric keys for the threadz page

2023-11-05 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20658 )

Change subject: IMPALA-12535: Fix misleading metric keys for the threadz page
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20658
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15a8cf0a318bc7122d1f5df29f18d8e467249ef7
Gerrit-Change-Number: 20658
Gerrit-PatchSet: 1
Gerrit-Owner: Quanlong Huang 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Mon, 06 Nov 2023 02:15:39 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12510: Floor PlanFragment.maxParallelism at 1

2023-10-20 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20606 )

Change subject: IMPALA-12510: Floor PlanFragment.maxParallelism_ at 1
..


Patch Set 2: Code-Review+2

(2 comments)

Thanks Riza for the quick fix.

http://gerrit.cloudera.org:8080/#/c/20606/2/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20606/2/fe/src/main/java/org/apache/impala/planner/ScanNode.java@89
PS2, Line 89: 1
nit. seems MIN_NUM_SCAN_THREADS is better


http://gerrit.cloudera.org:8080/#/c/20606/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
File 
testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test:

http://gerrit.cloudera.org:8080/#/c/20606/2/testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test@1
PS2, Line 1: # select star on empty table
nit. Can we add a comment: # Regression test for IMPALA-12510.



--
To view, visit http://gerrit.cloudera.org:8080/20606
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibfa50abfdb9cdb994c5c3d7904b377a25f5b8b97
Gerrit-Change-Number: 20606
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Sat, 21 Oct 2023 02:48:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12510: Floor PlanFragment.maxParallelism at 1

2023-10-20 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20606 )

Change subject: IMPALA-12510: Floor PlanFragment.maxParallelism_ at 1
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/20606/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20606/1//COMMIT_MSG@9
PS1, Line 9: bug
I assume the bug is only happening over an empty table, could you please 
describe a bit more how the bug looks like, what error it may throw.


http://gerrit.cloudera.org:8080/#/c/20606/1//COMMIT_MSG@14
PS1, Line 14: empty
nit. an empty


http://gerrit.cloudera.org:8080/#/c/20606/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java
File fe/src/main/java/org/apache/impala/planner/PlanFragment.java:

http://gerrit.cloudera.org:8080/#/c/20606/1/fe/src/main/java/org/apache/impala/planner/PlanFragment.java@1139
PS1, Line 1139: 1
How about sharing the same const mentioned in ScanNode.java? Or maybe use a 
separate const for this number?


http://gerrit.cloudera.org:8080/#/c/20606/1/fe/src/main/java/org/apache/impala/planner/ScanNode.java
File fe/src/main/java/org/apache/impala/planner/ScanNode.java:

http://gerrit.cloudera.org:8080/#/c/20606/1/fe/src/main/java/org/apache/impala/planner/ScanNode.java@89
PS1, Line 89: 1
How about we name a const for it? Like MIN_SCANNER_THREADS?


http://gerrit.cloudera.org:8080/#/c/20606/1/fe/src/main/java/org/apache/impala/planner/ScanNode.java@373
PS1, Line 373: 1
May use the const mentioned above.



--
To view, visit http://gerrit.cloudera.org:8080/20606
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibfa50abfdb9cdb994c5c3d7904b377a25f5b8b97
Gerrit-Change-Number: 20606
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto 
Gerrit-Reviewer: Abhishek Rawat 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Riza Suminto 
Gerrit-Reviewer: Wenzhe Zhou 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 20 Oct 2023 20:58:56 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12478: Invalid length in StringValue::LeastSmallerString()

2023-10-16 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20579 )

Change subject: IMPALA-12478: Invalid length in 
StringValue::LeastSmallerString()
..


Patch Set 1: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0888f89070a1efeb9efefd1c3ca96dfa873942fd
Gerrit-Change-Number: 20579
Gerrit-PatchSet: 1
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Yida Wu 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 16 Oct 2023 16:27:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5081: Add codegen opt level query option

2023-09-12 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20399 )

Change subject: IMPALA-5081: Add codegen_opt_level query option
..


Patch Set 7: Code-Review+1

(4 comments)

http://gerrit.cloudera.org:8080/#/c/20399/7/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/20399/7/be/src/codegen/llvm-codegen-cache.h@178
PS7, Line 178: opt_level
nit. do you think it good to add a DCHECK for opt_level to see if it out of 
TCodeGenOptLevel range? Because we will reset the cache entry from existing 
cache entry in the look up process, maybe some abnormal could be detected 
earlier here if opt value goes wrong.


http://gerrit.cloudera.org:8080/#/c/20399/7/be/src/codegen/llvm-codegen-cache.h@183
PS7, Line 183: num_opt_functions
nit. could you please add comments for the new fields? Would be good to tell 
the difference between the existing num_functions and num_instructions.


http://gerrit.cloudera.org:8080/#/c/20399/7/be/src/codegen/llvm-codegen-cache.h@189
PS7, Line 189:   TCodeGenOptLevel::type opt_level;
nit. could you please add a simple comment for the field.


http://gerrit.cloudera.org:8080/#/c/20399/7/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20399/7/be/src/codegen/llvm-codegen.cc@1454
PS7, Line 1454:   break;
nit. do we need the default case to handle unexpected values?



--
To view, visit http://gerrit.cloudera.org:8080/20399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I371f8758b6552263e91a1fbfd9a6e1c28e1fa2bd
Gerrit-Change-Number: 20399
Gerrit-PatchSet: 7
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 12 Sep 2023 17:51:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5081: Add codegen opt level query option

2023-08-24 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20399 )

Change subject: IMPALA-5081: Add codegen_opt_level query option
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20399/2/be/src/codegen/llvm-codegen-cache.h
File be/src/codegen/llvm-codegen-cache.h:

http://gerrit.cloudera.org:8080/#/c/20399/2/be/src/codegen/llvm-codegen-cache.h@166
PS2, Line 166:   bool Empty() { return engine_pointer == nullptr; }
> Not relevant to this change but wouldn't it be better to call Reset() here
Thanks Daniel for pointing this out. It makes sense to me. I think it was 
because previously the struct contained shared_ptr, and somehow needs some hack 
to initialize it, but now it changes to a pointer. It should be okay if all the 
codegen caching tests can pass.



--
To view, visit http://gerrit.cloudera.org:8080/20399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I371f8758b6552263e91a1fbfd9a6e1c28e1fa2bd
Gerrit-Change-Number: 20399
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Noemi Pap-Takacs 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Fri, 25 Aug 2023 00:09:35 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12253: Improve the integration of codegen cache and async codegen

2023-08-08 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20211 )

Change subject: IMPALA-12253: Improve the integration of codegen cache and 
async codegen
..


Patch Set 2:

Thanks Michael for the review. Passed exhaustive tests.


--
To view, visit http://gerrit.cloudera.org:8080/20211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908
Gerrit-Change-Number: 20211
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 08 Aug 2023 19:39:00 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12253: Improve the integration of codegen cache and async codegen

2023-08-08 Thread Yida Wu (Code Review)
Yida Wu has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/20211 )

Change subject: IMPALA-12253: Improve the integration of codegen cache and 
async codegen
..

IMPALA-12253: Improve the integration of codegen cache and async codegen

To improve the integration of codegen cache and async codegen, the
seperate the FinalizeModule() into two parts, PrepareFinalizeModule()
and FinalizeModule(). If codegen cache enabled,
PrepareFinalizeModule() would generate the key of the module, so that
if there is a cache hit, we could skip the FinalizeModule() no matter
in async or sync. Otherwise, if the cache misses, we would keep going
and call FinalizeModule() to finish the compilation of the module.

This helps the codegen cache to have the advantage of async codegen
for the first visit while reducing the need of creating new threads
for async codegen if cache exists.

Did the perf a-b test with async codegen and codegen cache enabled,
didn't see any significant performance regression.

Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908
---
M be/src/benchmarks/expr-benchmark.cc
M be/src/benchmarks/hash-benchmark.cc
M be/src/codegen/llvm-codegen-cache-test.cc
M be/src/codegen/llvm-codegen-test.cc
M be/src/codegen/llvm-codegen.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/expr-codegen-test.cc
M be/src/runtime/fragment-state.cc
M be/src/runtime/test-env.cc
M be/src/service/fe-support.cc
10 files changed, 125 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/11/20211/2
--
To view, visit http://gerrit.cloudera.org:8080/20211
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic5ae4b342ff8ef1c3b7ce35c927baa8b59d72908
Gerrit-Change-Number: 20211
Gerrit-PatchSet: 2
Gerrit-Owner: Yida Wu 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Smith 


[native-toolchain-CR] IMPALA-12334: Add debug symbols to LLVM assert build

2023-08-03 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20311 )

Change subject: IMPALA-12334: Add debug symbols to LLVM assert build
..


Patch Set 2:

Paste a link of related doc: https://llvm.org/docs/CMake.html#cmake-build-type


--
To view, visit http://gerrit.cloudera.org:8080/20311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c58bfe2233dcec39fcad92fc52ce372165229d8
Gerrit-Change-Number: 20311
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 03 Aug 2023 18:05:50 +
Gerrit-HasComments: No


[native-toolchain-CR] IMPALA-12334: Add debug symbols to LLVM assert build

2023-08-03 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20311 )

Change subject: IMPALA-12334: Add debug symbols to LLVM assert build
..


Patch Set 2: Code-Review+1


--
To view, visit http://gerrit.cloudera.org:8080/20311
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: native-toolchain
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9c58bfe2233dcec39fcad92fc52ce372165229d8
Gerrit-Change-Number: 20311
Gerrit-PatchSet: 2
Gerrit-Owner: Michael Smith 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Michael Smith 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 03 Aug 2023 18:04:42 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-12269: Codegen cache false negative because of function names hash

2023-08-03 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20168 )

Change subject: IMPALA-12269: Codegen cache false negative because of function 
names hash
..


Patch Set 6: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20168/5/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20168/5/be/src/codegen/llvm-codegen.cc@1438
PS5, Line 1438:   for (auto& entry : fns_to_jit_compile_) {
> It's already checked by the DCHECKS on L1448 and L1469.
Ack



--
To view, visit http://gerrit.cloudera.org:8080/20168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf1d2b424c969fbba181ab90bf9c7bf22355f139
Gerrit-Change-Number: 20168
Gerrit-PatchSet: 6
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 03 Aug 2023 15:02:38 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12269: Codegen cache false negative because of function names hash

2023-08-01 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20168 )

Change subject: IMPALA-12269: Codegen cache false negative because of function 
names hash
..


Patch Set 5: Code-Review+1

(2 comments)

It looks good. Could Csaba have one more look at the patch after changing to 
use map for codegen functions?

http://gerrit.cloudera.org:8080/#/c/20168/5/be/src/codegen/llvm-codegen.cc
File be/src/codegen/llvm-codegen.cc:

http://gerrit.cloudera.org:8080/#/c/20168/5/be/src/codegen/llvm-codegen.cc@1202
PS5, Line 1202:
nit. unnecessary newline


http://gerrit.cloudera.org:8080/#/c/20168/5/be/src/codegen/llvm-codegen.cc@1438
PS5, Line 1438:   // Get pointers to all codegen'd functions.
Because for SetFunctionPointers() "either both or none of 'cache' and 
'cache_key' should be NULL", can we add a DCHECK here for cache and cache_key 
to make sure they meet our requirement?



--
To view, visit http://gerrit.cloudera.org:8080/20168
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibf1d2b424c969fbba181ab90bf9c7bf22355f139
Gerrit-Change-Number: 20168
Gerrit-PatchSet: 5
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Tue, 01 Aug 2023 23:18:00 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-12292: TestCodegenCache.{test codegen cache with asm module dir,test codegen cache with perf map} fail in builds

2023-07-27 Thread Yida Wu (Code Review)
Yida Wu has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20224 )

Change subject: IMPALA-12292: 
TestCodegenCache.{test_codegen_cache_with_asm_module_dir,test_codegen_cache_with_perf_map}
 fail in builds
..


Patch Set 4: Code-Review+2

Thanks Daniel for fixing this. To enhance the robustness of the testcase, we 
could follow up on IMPALA-12306.


--
To view, visit http://gerrit.cloudera.org:8080/20224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I15320b8c0d06f4d93927b19731c11bd4e15b3690
Gerrit-Change-Number: 20224
Gerrit-PatchSet: 4
Gerrit-Owner: Daniel Becker 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Daniel Becker 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Yida Wu 
Gerrit-Comment-Date: Thu, 27 Jul 2023 18:20:06 +
Gerrit-HasComments: No


  1   2   3   4   5   >