[kudu-CR] [metrics] Add tablet level metrics for scans op time

2024-05-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21291 )

Change subject: [metrics] Add tablet level metrics for scans op time
..

[metrics] Add tablet level metrics for scans op time

We currently have monitoring in place for scan data volume and
slow scans, but we are still lacking monitoring data for scan
request timings.

In this patch, I have added monitoring for scan request timings at
the tablet level to assist us in pinpointing specific scenarios of
high CPU usage during scanning operations.

Change-Id: I8f490cfb6f37aef60b34697100fb502374fcc503
Reviewed-on: http://gerrit.cloudera.org:8080/21291
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
---
M src/kudu/tablet/tablet_metrics.cc
M src/kudu/tablet/tablet_metrics.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/util/stopwatch.h
6 files changed, 237 insertions(+), 45 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8f490cfb6f37aef60b34697100fb502374fcc503
Gerrit-Change-Number: 21291
Gerrit-PatchSet: 7
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] [metrics] Add tablet level metrics for scans op time

2024-05-23 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21291 )

Change subject: [metrics] Add tablet level metrics for scans op time
..


Patch Set 6: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f490cfb6f37aef60b34697100fb502374fcc503
Gerrit-Change-Number: 21291
Gerrit-PatchSet: 6
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Thu, 23 May 2024 19:30:11 +
Gerrit-HasComments: No


[kudu-CR] [metrics] Add tablet level metrics for scans op time

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21291 )

Change subject: [metrics] Add tablet level metrics for scans op time
..


Patch Set 5: Code-Review+1

(12 comments)

Almost there, just a style and wording nits.

Thank you for revving the patch!

http://gerrit.cloudera.org:8080/#/c/21291/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21291/5//COMMIT_MSG@10
PS5, Line 10: scan time
nit: scan request timings

That's less ambiguous since 'scan time' might be confused with the total time 
spent to fetch all the data from all the involved tablet servers based on the 
scan's spec.

>From what I could see, this patch adds histogram metrics to track timings of 
>individual scan requests (both new and continuation ones).


http://gerrit.cloudera.org:8080/#/c/21291/5//COMMIT_MSG@11
PS5, Line 11: scan time
nit: scan request timings


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.h
File src/kudu/tablet/tablet_metrics.h:

http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.h@68
PS5, Line 68:   scoped_refptr tablet_scans_duration_wall_time;
:   scoped_refptr tablet_scans_duration_system_time;
:   scoped_refptr tablet_scans_duration_user_time;
These metrics are already in the tablet context, similar to all the other 
fields in this TabletMetrics struct.

With that, maybe rename these to match the others:

  scan_duration_system_time
  scan_duration_user_time
  scan_duration_wall_time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc@157
PS5, Line 157: Tablet Scans Duration In Wall Time
Scan Requests Wall Time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc@159
PS5, Line 159: Wall time spent scanners scans
Duration of scan requests, wall time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc@164
PS5, Line 164: Tablet Scans Duration In System Time
Scan Requests System Time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc@166
PS5, Line 166: System time spent scanners scans.
Duration of scan requests, system time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc@171
PS5, Line 171: Tablet Scans Duration In User Time
Scan Requests User Time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tablet/tablet_metrics.cc@173
PS5, Line 173: User time spent scanners scans
Duration of scan requests, user time


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tserver/scanners.h
File src/kudu/tserver/scanners.h:

http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tserver/scanners.h@410
PS5, Line 410: Add the timings in 'elapsed' to the metrics of tablet.
Store the scan request's timings in the tablet metrics.


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tserver/scanners.h@411
PS5, Line 411: update_tablet_metrics
style nit: since this isn't an accessor/getter, this should be named like 
UpdateTabletMetrics()


http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tserver/scanners.cc
File src/kudu/tserver/scanners.cc:

http://gerrit.cloudera.org:8080/#/c/21291/5/src/kudu/tserver/scanners.cc@529
PS5, Line 529: The time spent on this scanner scan
nit: Store scan request's timings



--
To view, visit http://gerrit.cloudera.org:8080/21291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f490cfb6f37aef60b34697100fb502374fcc503
Gerrit-Change-Number: 21291
Gerrit-PatchSet: 5
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Thu, 23 May 2024 03:49:47 +
Gerrit-HasComments: Yes


[kudu-CR](branch-1.17.x) Fix row project codegen params noalias overflow

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21448 )

Change subject: Fix row_project codegen params noalias overflow
..

Fix row_project codegen params noalias overflow

function->addParamAttr is 0-based indexes, current row_project generator IR 
code is:
`define i1 @ProjRead(i8* %src, %"class.kudu::RowBlockRow"* noalias %rbrow, 
%"class.kudu::Arena"* noalias %arena)`
not same as before.

Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Reviewed-on: http://gerrit.cloudera.org:8080/19952
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
(cherry picked from commit c485c8c3cf4b76d8a55e2ec98e27803341285f75)
Reviewed-on: http://gerrit.cloudera.org:8080/21448
Tested-by: Alexey Serbin 
---
M src/kudu/codegen/row_projector.cc
1 file changed, 1 insertion(+), 2 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 21448
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR](branch-1.17.x) Fix row project codegen params noalias overflow

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21448 )

Change subject: Fix row_project codegen params noalias overflow
..


Patch Set 2:

> By the way, I think kudu codegen sometimes not flexible, it need
 > compiler-clang version and llvm-lib version match.Otherwise
 > precompiled IR can not be complied by llvm-lib due to poor IR-code
 > compatibility.
 > Do we consider use a real JIT to generate code? This will avoid
 > mismatch between clang version and llvm-lib version.

AFAIK, to build its codegen components, Kudu always uses LLVM and CLANG from 
Kudu's thirdparty, and no external components are used for that.  With that, 
there isn't a problem with version mismatch, IIUC.  However, there is an issue 
with thirdparty's CLANG automatically picking up the very latest version of GCC 
toolchain (the STL library, etc.) if it's available at the build machine -- it 
might lead to ABI incompatibility and crashes if the build machine has multiple 
versions of GCC toolchain installed.  The issue is tracked by KUDU-3545.  It 
seems that might be fixed once upgrading Kudu's thirdparty to LLVM/CLANG 16.0.0 
or newer and start using the --gcc-install-dir flag to have more control on the 
GCC toolchain used for Kudu's codegen.

As for utilizing more JIT components/features, so far I don't know about 
currently ongoing or planned projects in this area.  Feel free to file a JIRA 
with detailed description to provide more context on that and share your ideas.

Thank you!


--
To view, visit http://gerrit.cloudera.org:8080/21448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 21448
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 23 May 2024 02:58:32 +
Gerrit-HasComments: No


[kudu-CR](branch-1.17.x) Fix row project codegen params noalias overflow

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: Fix row_project codegen params noalias overflow
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 21448
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR](branch-1.17.x) Fix row project codegen params noalias overflow

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21448 )

Change subject: Fix row_project codegen params noalias overflow
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 21448
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 23 May 2024 02:35:07 +
Gerrit-HasComments: No


[kudu-CR](branch-1.17.x) Fix row project codegen params noalias overflow

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21448 )

Change subject: Fix row_project codegen params noalias overflow
..


Patch Set 2: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 21448
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 23 May 2024 02:35:01 +
Gerrit-HasComments: No


[kudu-CR] Add a benchmark for CBTree concurrent writes.

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21447 )

Change subject: Add a benchmark for CBTree concurrent writes.
..


Patch Set 1: Code-Review+1

(15 comments)

http://gerrit.cloudera.org:8080/#/c/21447/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21447/1//COMMIT_MSG@32
PS1, Line 32: writing
nit: writer ?

Otherwise, maybe re-phrase it to be "Reading threads ... writing threads ..."


http://gerrit.cloudera.org:8080/#/c/21447/1//COMMIT_MSG@37
PS1, Line 37: eachi
nit: each


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc
File src/kudu/tablet/cbtree-test.cc:

http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@31
PS1, Line 31: #include "kudu/gutil/stringprintf.h"
It seems you'll need to rebase this patch on top the HEAD revision in the 
master branch of the repo to get '#include "kudu/gutil/atomicops.h"' that IWYU 
wants to see here.


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@907
PS1, Line 907:   snprintf(kbuf, 64, "key_%d", i);
nit: consider ether converting this into a functor visible only in the scope 
where the first two arguments are defined (and maybe even using sizeof(...) 
instead of hard-coded constant, if possible) or change the signature to supply 
information on the size of the buffers.  Alternatively, use std::array for the 
first two parameters.


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@924
PS1, Line 924:   constexpr int trials = 10;
style nit here and below: use kCamelName pattern for static/constexpr constants


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@925
PS1, Line 925:   constexpr int num_writer_threads = 9;
 :   constexpr int num_reader_threads = 10;
Do you want these to be hard-coded, or there might be some extra value in 
defining these via command-line flags (with default values as they are set 
now), so one could try to see how CBTree behaves with customized settings?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@928
PS1, Line 928: esy
?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@931
PS1, Line 931: ≡
nit: does LINT and 'git check' approve using this symbol?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@938
PS1, Line 938: barrier
nit: barriers?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@947
PS1, Line 947: variables
nit: values


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@958
PS1, Line 958: for (;;)
style nit for here and below: is it possible to use 'while (true)' to match the 
rest of the code in this file and in the Kudu project overall?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1024
PS1, Line 1024: Threads
nit: threads


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1028
PS1, Line 1028: LOG_TIMING(INFO,
Does it make sense to check for failures before starting the second phase of 
the scenario?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1029
PS1, Line 1029: $2
nit: $2 values ?


http://gerrit.cloudera.org:8080/#/c/21447/1/src/kudu/tablet/cbtree-test.cc@1029
PS1, Line 1029: Threads
nit: threads



--
To view, visit http://gerrit.cloudera.org:8080/21447
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1b0b16e269c70716962fc5ebb4ddca1e2cbe68a4
Gerrit-Change-Number: 21447
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 22 May 2024 23:26:13 +
Gerrit-HasComments: Yes


[kudu-CR](branch-1.17.x) Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21451 )

Change subject: Fix deadlock on fail for CBTree-test
..

Fix deadlock on fail for CBTree-test

When TestConcurrentIterateAndInsert, TestConcurrentInsert,
TestRacyConcurrentInsert fail while --gtest_repeat is used, they
will keep running forever. Instead of just returning on fail,
they should properly stop the other threads running, and then exit.

To reproduce the problem, run this on ARM (where the test actually
fails):
./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy*

Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Reviewed-on: http://gerrit.cloudera.org:8080/21446
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
(cherry picked from commit d1bc5b53ef006ac14bca4417c4745c5875d89734)
Reviewed-on: http://gerrit.cloudera.org:8080/21451
Reviewed-by: Abhishek Chennaka 
---
M src/kudu/tablet/cbtree-test.cc
1 file changed, 4 insertions(+), 3 deletions(-)

Approvals:
  Abhishek Chennaka: Looks good to me, approved
  Alexey Serbin: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21451
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 


[kudu-CR](branch-1.17.x) Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: Fix deadlock on fail for CBTree-test
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21451
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 


[kudu-CR](branch-1.17.x) Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21451 )

Change subject: Fix deadlock on fail for CBTree-test
..


Patch Set 1: Verified+1

unrelated test failures due to:
  * somewhat polluted thirdparty workspace (it's been built for the main 
branch, not for branch-1.17.x)
  * known flakiness mostly related to Ranger's warts and farts


--
To view, visit http://gerrit.cloudera.org:8080/21451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21451
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Comment-Date: Wed, 22 May 2024 21:12:05 +
Gerrit-HasComments: No


[kudu-CR](branch-1.17.x) Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21451


Change subject: Fix deadlock on fail for CBTree-test
..

Fix deadlock on fail for CBTree-test

When TestConcurrentIterateAndInsert, TestConcurrentInsert,
TestRacyConcurrentInsert fail while --gtest_repeat is used, they
will keep running forever. Instead of just returning on fail,
they should properly stop the other threads running, and then exit.

To reproduce the problem, run this on ARM (where the test actually
fails):
./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy*

Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Reviewed-on: http://gerrit.cloudera.org:8080/21446
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
(cherry picked from commit d1bc5b53ef006ac14bca4417c4745c5875d89734)
---
M src/kudu/tablet/cbtree-test.cc
1 file changed, 4 insertions(+), 3 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/51/21451/1
--
To view, visit http://gerrit.cloudera.org:8080/21451
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21451
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Zoltan Martonka 


[kudu-CR] Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21446 )

Change subject: Fix deadlock on fail for CBTree-test
..

Fix deadlock on fail for CBTree-test

When TestConcurrentIterateAndInsert, TestConcurrentInsert,
TestRacyConcurrentInsert fail while --gtest_repeat is used, they
will keep running forever. Instead of just returning on fail,
they should properly stop the other threads running, and then exit.

To reproduce the problem, run this on ARM (where the test actually
fails):
./bin/cbtree-test --gtest_repeat=100 --gtest_filter=*Racy*

Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Reviewed-on: http://gerrit.cloudera.org:8080/21446
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
---
M src/kudu/tablet/cbtree-test.cc
1 file changed, 4 insertions(+), 3 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21446
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 


[kudu-CR] Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21446 )

Change subject: Fix deadlock on fail for CBTree-test
..


Patch Set 4: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21446
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Comment-Date: Wed, 22 May 2024 19:22:23 +
Gerrit-HasComments: No


[kudu-CR] Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21446 )

Change subject: Fix deadlock on fail for CBTree-test
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21446
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Comment-Date: Wed, 22 May 2024 19:22:31 +
Gerrit-HasComments: No


[kudu-CR] Fix deadlock on fail for CBTree-test

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: Fix deadlock on fail for CBTree-test
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21446
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ia10d05dfdc4a12cb034450f432693f054d138498
Gerrit-Change-Number: 21446
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 


[kudu-CR] Fix row project codegen params noalias overflow

2024-05-22 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19952 )

Change subject: Fix row_project codegen params noalias overflow
..


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/19952/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/19952/5//COMMIT_MSG@9
PS5, Line 9: function->addParamAttr is 0-based indexes, current row_project 
generator IR code is:
   : `define i1 @ProjRead(i8* %src, %"class.kudu::RowBlockRow"* noalias 
%rbrow, %"class.kudu::Arena"* noalias %arena)`
> nit: alignment required
I'm not going to update this -- this patch is already merged.



--
To view, visit http://gerrit.cloudera.org:8080/19952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 19952
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 22 May 2024 13:53:41 +
Gerrit-HasComments: Yes


[kudu-CR](branch-1.17.x) Fix row project codegen params noalias overflow

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21448


Change subject: Fix row_project codegen params noalias overflow
..

Fix row_project codegen params noalias overflow

function->addParamAttr is 0-based indexes, current row_project generator IR 
code is:
`define i1 @ProjRead(i8* %src, %"class.kudu::RowBlockRow"* noalias %rbrow, 
%"class.kudu::Arena"* noalias %arena)`
not same with before.

Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Reviewed-on: http://gerrit.cloudera.org:8080/19952
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
(cherry picked from commit c485c8c3cf4b76d8a55e2ec98e27803341285f75)
---
M src/kudu/codegen/row_projector.cc
1 file changed, 1 insertion(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/48/21448/1
--
To view, visit http://gerrit.cloudera.org:8080/21448
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: newchange
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 21448
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward 


[kudu-CR] Fix row project codegen params noalias overflow

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/19952 )

Change subject: Fix row_project codegen params noalias overflow
..


Patch Set 4: Code-Review+2

Thanks a lot for the patch!

I was cleaning up my mailbox and found an email about this patch.  It's been 
almost a year!  I'm not sure how I missed this.  :)

Indeed: the newly addParamAttr() uses 0-based indices once it has been 
introduced in LLVM 5.0.0 with [1] in March 2017.

[1] 
https://github.com/llvm/llvm-project/commit/a0b45f4bfccb3ab197ca504032c28160ce82eac2


--
To view, visit http://gerrit.cloudera.org:8080/19952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 19952
Gerrit-PatchSet: 4
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 22 May 2024 05:57:05 +
Gerrit-HasComments: No


[kudu-CR] Fix row project codegen params noalias overflow

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19952 )

Change subject: Fix row_project codegen params noalias overflow
..

Fix row_project codegen params noalias overflow

function->addParamAttr is 0-based indexes, current row_project generator IR 
code is:
`define i1 @ProjRead(i8* %src, %"class.kudu::RowBlockRow"* noalias %rbrow, 
%"class.kudu::Arena"* noalias %arena)`
not same with before.

Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Reviewed-on: http://gerrit.cloudera.org:8080/19952
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
---
M src/kudu/codegen/row_projector.cc
1 file changed, 1 insertion(+), 2 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19952
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I1dab7d46cff96ed1ebbd020584a066f04e6ca12a
Gerrit-Change-Number: 19952
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-613: Introduce new BlockCache metrics

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21389 )

Change subject: KUDU-613: Introduce new BlockCache metrics
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21389/2/src/kudu/util/block_cache_metrics.cc
File src/kudu/util/block_cache_metrics.cc:

PS2:
> I was thinking more in the terms of a single entry life journey tracking i.
I don't think we could provide a per-element metric in this context.  The way 
how it's usually achieved is maintaining a histogram-like metric that 
accumulates stats on the distribution of a gauge.

I guess you are suggesting to collect stats on the frequency of seeing a 
particular key.  But that's rather in the design of the cache, not just metrics 
of the SLRU cache.  E.g., the TinyLFU cache keeps similar stats on per-element 
basis (actually, it's closer to Bloom filter's approach) and uses that for 
admission control decisions.

But I might misunderstood your suggestion, though.  Please feel free to chime 
in and correct my misunderstanding, if any.


PS2:
> I definitely considered just providing top level metrics for the SLRU cache
Sure, we need to have detailed metrics for the parts of the SLRU cache.  
Re-using the set of LRU metrics for both the protected and the probational 
segments of the cache is quite natural and those stats provide information for 
generic troubleshooting and assessing the ratio of the segment sizes.  However, 
you could provide function gauges for the high-level cache metrics that an 
end-user might be interested in, improving the usability for an end-user.

If you prefer to address it in a separate patch, I'm OK with that approach.



--
To view, visit http://gerrit.cloudera.org:8080/21389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c8181ec6bea301605aaef7db0003c3eaef3072d
Gerrit-Change-Number: 21389
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Wed, 22 May 2024 02:59:10 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 14:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG@22
PS16, Line 22:
> This was a DEBUG build run locally on macOS. 6 cores and 2.6 GHz. I'll try
Thanks for clarifying.

We are mostly interested in RELEASE build stats in the context of performance 
evaluation.  DEBUG and sanitizer builds don't provide any meaningful stats since
nobody runs DEBUG builds in production :)

You can always create create a VM in hosted or public cloud providers, build 
Kudu from scratch there and run performance tests.

I think it's essential to have the actual stats for the lookup operations 
provides for a RELEASE build in this description.


http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h@228
PS14, Line 228:   // Sets probationary and protected segments metrics to update 
corresponding counters accordingly.
  :   //
  :   // An upgrade of one entry will increment the number of 
evictions from the probationary segment by
  :   // one and the number of insertions into the protected 
segment by one.
  :   //
  :   // A downgrade of one entry will increment the number of 
evictions from the protected segment by
  :   // one and the number of insertions into the probationary 
segment by one.
  :   //
  :   // To calculate the number of inserts for the SLRU cache, 
subtract the number of evictions
  :   // of the protected segment from the number of inserts of the 
probationary segment.
  :   //
  :   // To calculate the number of evictions for the SLRU cache, 
subtract the number of insertions
  :   // of the protected segment from the number of evictions of 
the probationary segment.
  :   //
  :   // Other than insertions and evictions, to calculate the 
metric for the SLRU cache, add the
  :   // corresponding metric from the probationary and protected 
segment together.
  :   virtual void SetSegmentMetrics(std::unique_ptr 
probationary_metrics,
  :  std::unique_ptr 
protected_metrics,
  :  ExistingMetricsPolicy 
metrics_policy) = 0;
> SetMetrics() is used in other areas of the code that use a cache (file_cach
OK, I guess one can try to circle back on the attempts to clean up this mess 
and bad smell in a follow-up patch.

In this iteration, please add a TODO for the extra methods that are specific to 
the SLRU cache.  One approach might be adding extra interface(s) and moving the 
specific methods out of the common Cache interface.

Thank you!


http://gerrit.cloudera.org:8080/#/c/20607/18/src/kudu/util/slru_cache-test.cc
File src/kudu/util/slru_cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/20607/18/src/kudu/util/slru_cache-test.cc@442
PS18, Line 442:   const int weight = static_cast(protected_segment_size_ / 
100);
It would be great to add a comment to explain what 'downgrade eviction' means 
along with short description on what essential invariants we have to track in 
this scenario and why.


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h
File src/kudu/util/slru_cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h@40
PS16, Line 40: NewSLRUCache
> I guess then the question is what are the template parameters? It makes sen
Those are the usual benefits of having common approach: this is to have more 
uniform code when instantiating cache, and it automatically helps in a 
templatized code.  It could also help with differences in the signatures of the 
constructors.

For example, one extra level of indirection to help with the different 
signatures of the constructors might be introducing CacheParameters template 
and hiding differences in the set of parameters for the constructors.  That's a 
common way of doing so -- you could check SslTypeTraits in openssl_util.h for 
an inspiration.

I guess we could keep this as-is for now, especially if you are not fond of 
getting rid of the code ugliness in this patch itself.  There might be an 
opportunity to clean this up later on, but most likely this will stay as ugly 
as its left in the first version :)


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc
File src/kudu/util/slru_cache.cc:

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@57
PS16, Line 57: class SLRUCacheShard {
> It's definitely possible, however, the cache file would be even more pollut
IIUC, you could 

[kudu-CR] KUDU-2671: Update upstream docs

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21108 )

Change subject: KUDU-2671: Update upstream docs
..


Patch Set 1: Code-Review+1

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21108/1/docs/schema_design.adoc
File docs/schema_design.adoc:

http://gerrit.cloudera.org:8080/#/c/21108/1/docs/schema_design.adoc@443
PS1, Line 443: table wide
nit: table-wide ?



--
To view, visit http://gerrit.cloudera.org:8080/21108
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8da554851a124d1d357be65d8bcc2c6c37875dcc
Gerrit-Change-Number: 21108
Gerrit-PatchSet: 1
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 21 May 2024 23:00:25 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) gh-pages should not tirgger build

2024-05-21 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21404 )

Change subject: gh-pages should not tirgger build
..


Patch Set 1:

It seems the test has passed -- no build had been triggered.  Probably, it's 
time to close/abandon this patch? :)


--
To view, visit http://gerrit.cloudera.org:8080/21404
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: I50aaf71ae5bf8a288d8c7347f1fe5ecdb5143654
Gerrit-Change-Number: 21404
Gerrit-PatchSet: 1
Gerrit-Owner: Marton Greber 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Comment-Date: Tue, 21 May 2024 20:55:26 +
Gerrit-HasComments: No


[kudu-CR] [metrics] Add metrics for tablet copy op time

2024-05-17 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21356 )

Change subject: [metrics] Add metrics for tablet copy op time
..


Patch Set 8:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tablet/tablet_metrics.cc@227
PS8, Line 227: Tablet Copy Operation Duration
Would 'Tablet Copy Duration' be good enough?


http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tablet/tablet_metrics.cc@229
PS8, Line 229: on this tablet
Does it make sense to mention this is the duration as seen from the source 
tablet replica?


http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tablet/tablet_metrics.cc@229
PS8, Line 229: copy tablet
tablet copying


http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tablet/tablet_metrics.cc@231
PS8, Line 231: 6000LU
Yingchun already pointed at that in PS5, but it seems there is still room for 
improvement.

With current settings, 6000 stands for maximum value of 1 minute (60 
seconds).  Are you sure this makes sense?  I suspect that for a large tablet 
replica being copied over a slow network it might take tens of minutes to 
complete the operation, so an hour for the maximum duration would be prudent.

Also, I don't think it makes a lot of sense to have microseconds for the unit 
for the duration here.  I'd think it to be milliseconds at most.


http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tserver/tablet_copy_service-test.cc
File src/kudu/tserver/tablet_copy_service-test.cc:

http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tserver/tablet_copy_service-test.cc@237
PS8, Line 237: TEST_F(TabletCopyMetricTest, TestRunTimeMetricFinishState) {
 :   const auto before_cnt = CopyRunTime()->TotalCount();
 :   string session_id;
 :   ASSERT_OK(DoBeginValidTabletCopySession(_id));
 :
 :   EndTabletCopySessionResponsePB resp;
 :   RpcController controller;
 :   ASSERT_OK(DoEndTabletCopySession(session_id, true, nullptr, 
, ));
 :   ASSERT_EQ(before_cnt + 1, CopyRunTime()->TotalCount());
 : }
Does it make sense to verify the corresponding metric both at the destination 
tablet replica as well?  I guess it should show 0 for TotalCount(), right?


http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tserver/tablet_copy_source_session.cc
File src/kudu/tserver/tablet_copy_source_session.cc:

http://gerrit.cloudera.org:8080/#/c/21356/8/src/kudu/tserver/tablet_copy_source_session.cc@514
PS8, Line 514: int64_t elapsed_ms = (MonoTime::Now() - 
start_time_).ToMilliseconds();
 : metrics->tablet_copy_duration->Increment(elapsed_ms);
In tablet_metrics.cc, the tablet_copy_duration is defined in microseconds is 
PS8, so this code is inconsistent.



--
To view, visit http://gerrit.cloudera.org:8080/21356
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I088f6a9a8a07ad39ca95ae8b4995ce00d1a0d00c
Gerrit-Change-Number: 21356
Gerrit-PatchSet: 8
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Fri, 17 May 2024 18:14:55 +
Gerrit-HasComments: Yes


[kudu-CR] [security-flags-itest] Fix missing command line flags

2024-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21399 )

Change subject: [security-flags-itest] Fix missing command line flags
..


Patch Set 4: Code-Review+2

Thank you for the fix!


--
To view, visit http://gerrit.cloudera.org:8080/21399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc
Gerrit-Change-Number: 21399
Gerrit-PatchSet: 4
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Wed, 15 May 2024 15:57:32 +
Gerrit-HasComments: No


[kudu-CR] [security-flags-itest] Fix missing command line flags

2024-05-15 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21399 )

Change subject: [security-flags-itest] Fix missing command line flags
..

[security-flags-itest] Fix missing command line flags

It is a known phenomenon, that static libraries won't be included into
an executable if there are no usage for any function or variable[1].
This means the initialization routines in the library won't be executed,
even if these initialization routines have side effects, such as
registering the variable in the gflags ecosystem. As a result, the
CheckRpcAuthnFlagsGroupValidator test failed because
"rpc_authentication" flag was not registered properly. To solve this
issue, a command line variable check is added, so now the library will
be used in the executable and the initialization routines will be
executed.

[1] 
https://stackoverflow.com/questions/1229430/how-do-i-prevent-my-unused-global-variables-being-compiled-out-of-my-static-li

Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc
Reviewed-on: http://gerrit.cloudera.org:8080/21399
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
---
M src/kudu/integration-tests/security-flags-itest.cc
1 file changed, 12 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc
Gerrit-Change-Number: 21399
Gerrit-PatchSet: 5
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] [common] get rid of MutexLock

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21415 )

Change subject: [common] get rid of MutexLock
..

[common] get rid of MutexLock

Since contemporary STL library provides both std::lock_guard and
std::unique_lock, there is no need to keep MutexLock.

Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Reviewed-on: http://gerrit.cloudera.org:8080/21415
Tested-by: Alexey Serbin 
Reviewed-by: Abhishek Chennaka 
---
M src/kudu/clock/builtin_ntp.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
M src/kudu/rpc/service_queue.h
M src/kudu/server/diagnostics_log.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/table_scanner.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_service.h
M src/kudu/util/async_logger.cc
M src/kudu/util/async_logger.h
M src/kudu/util/barrier.h
M src/kudu/util/blocking_queue-test.cc
M src/kudu/util/blocking_queue.h
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/countdown_latch.h
M src/kudu/util/debug/trace_event_impl.cc
M src/kudu/util/debug/trace_event_synthetic_delay.cc
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
M src/kudu/util/mutex.h
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/rwc_lock.cc
M src/kudu/util/test_graph.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
33 files changed, 146 insertions(+), 197 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Abhishek Chennaka: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 5
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21414 )

Change subject: [common] switch from unique_lock to lock_guard
..

[common] switch from unique_lock to lock_guard

To simplify working with STL synchronization primitives, this patch
changes from std::unique_lock to std::lock_guard where appropriate.
In addition, IWYU and ClangTidy's feedback has been addressed.

Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Reviewed-on: http://gerrit.cloudera.org:8080/21414
Tested-by: Alexey Serbin 
Reviewed-by: Zoltan Chovan 
Reviewed-by: Abhishek Chennaka 
---
M src/kudu/clock/hybrid_clock.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/log_cache.cc
M src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/rpc/rpcz_store.cc
M src/kudu/security/tls_context.cc
M src/kudu/server/webserver.cc
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/scanners.cc
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/trace_metrics.h
14 files changed, 36 insertions(+), 31 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Zoltan Chovan: Looks good to me, but someone else must approve
  Abhishek Chennaka: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] [metrics] Add tablet level metrics for scans op time

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21291 )

Change subject: [metrics] Add tablet level metrics for scans op time
..


Patch Set 4:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21291/4/src/kudu/tablet/tablet_metrics.h
File src/kudu/tablet/tablet_metrics.h:

http://gerrit.cloudera.org:8080/#/c/21291/4/src/kudu/tablet/tablet_metrics.h@68
PS4, Line 68:   scoped_refptr tablet_scans_duration_wall_time;
:   scoped_refptr tablet_scans_duration_system_time;
That's interesting: the system and the wall times are collected, but not the 
user time.  Why?


http://gerrit.cloudera.org:8080/#/c/21291/4/src/kudu/tablet/tablet_metrics.cc
File src/kudu/tablet/tablet_metrics.cc:

http://gerrit.cloudera.org:8080/#/c/21291/4/src/kudu/tablet/tablet_metrics.cc@160
PS4, Line 160: kInfo
Here and below: I'm not sure this metric is anything but extra information 
which might be needed for troubleshooting purposes only: for each active 
scanner, corresponding information is already displayed at the 'Dashboard' tab 
of the embedded WebUI.

With that, I'd think of kDebug level, not kInfo.


http://gerrit.cloudera.org:8080/#/c/21291/4/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/21291/4/src/kudu/tserver/tablet_service.cc@3215
PS4, Line 3215: auto elapsed = scanner_timer.GetElapsed();
There are several earlier returns from this method where scanner times are 
updated automatically by the scoped-like construct ScopedAddScannerTiming.  
However, the metrics aren't updated accordingly.  Why?



--
To view, visit http://gerrit.cloudera.org:8080/21291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8f490cfb6f37aef60b34697100fb502374fcc503
Gerrit-Change-Number: 21291
Gerrit-PatchSet: 4
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Tue, 14 May 2024 22:15:37 +
Gerrit-HasComments: Yes


[kudu-CR] [metrics] Add metrics for participant op time

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21375 )

Change subject: [metrics] Add metrics for participant op time
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21375/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21375/4//COMMIT_MSG@9
PS4, Line 9: Add tablet-level statistics to track the time consumption of
   : participant operations.
Have you ever used multi-row transactions in one of your Kudu clusters?

If talking about TXN participant meta-operations, I'd recommend introducing new 
metrics based on particular needs and observations from the field, not just 
blindly adding a metric to fill in the blank spots.

One source of strange smell in this patch is having an umbrella timings metric 
for very different types of TXN state transitions.  AFAIK, the timings for 
FINALIZE_COMMIT and ABORT_TXN might be different order of magnitude compared 
with BEGIN_TXN due to their nature, and accumulating them under the same 
umbrella looks like quite a strange idea to me.



--
To view, visit http://gerrit.cloudera.org:8080/21375
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie25d4bee27e9ba35925634b61fb518467bb59201
Gerrit-Change-Number: 21375
Gerrit-PatchSet: 4
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Tue, 14 May 2024 21:47:29 +
Gerrit-HasComments: Yes


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21414 )

Change subject: [common] switch from unique_lock to lock_guard
..


Patch Set 2:

> (1 comment)

One more reference that might be useful in this context: 
https://stackoverflow.com/questions/43019598/stdlock-guard-or-stdscoped-lock


--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Tue, 14 May 2024 21:22:53 +
Gerrit-HasComments: No


[kudu-CR] [common] get rid of MutexLock

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [common] get rid of MutexLock
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] [common] get rid of MutexLock

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21415 )

Change subject: [common] get rid of MutexLock
..


Patch Set 4: Verified+1

unrelated test failures, posted a fix to address at least one of them: 
https://gerrit.cloudera.org/#/c/21427/


--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Tue, 14 May 2024 21:21:02 +
Gerrit-HasComments: No


[kudu-CR] [alter table-test] Increase assertion timeout

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21347 )

Change subject: [alter_table-test] Increase assertion timeout
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21347/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21347/1//COMMIT_MSG@10
PS1, Line 10: So timeout was increased to 60 seconds.
> This looks like a different issue than KUDU-3509.
You might need to update the build-support/dist_test.py to make it possible to 
start dist-test from binaries build at Ubuntu 22.04.  If you want to run the 
test via dist-test, just build it on something like ubuntu18 or CentOS7 -- 
that's known to work.

Otherwise, you could run the test locally with many iterations (i.e. setting 
--gtest_iterations=1000 or alike), and maybe adding extra flags to make the 
issue manifest itself more often (e.g., adding --stress_cpu_threads=8 or 
alike).  The idea is to get the stats on the flakiness before and after the fix 
to make sure this patch addresses the issue.



--
To view, visit http://gerrit.cloudera.org:8080/21347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc51365a798569c22e680eb3e7723a5f330d43c5
Gerrit-Change-Number: 21347
Gerrit-PatchSet: 1
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Tue, 14 May 2024 21:14:11 +
Gerrit-HasComments: Yes


[kudu-CR] [tablet] fix race accessing OpState's start time

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21427


Change subject: [tablet] fix race accessing OpState's start time
..

[tablet] fix race accessing OpState's start time

This patch addresses a race reported by TSAN with traces like below:

WARNING: ThreadSanitizer: data race (pid=11024)
  Write of size 8 at 0x7b580011f260 by thread T174:
#0 kudu::tablet::OpState::set_start_time(kudu::MonoTime) 
src/kudu/tablet/ops/op.h:274:58
#1 kudu::tablet::WriteOp::Start() src/kudu/tablet/ops/write_op.cc:273:11
#2 kudu::tablet::OpDriver::Prepare() src/kudu/tablet/ops/op_driver.cc:329:7
#3 kudu::tablet::OpDriver::PrepareTask() 
src/kudu/tablet/ops/op_driver.cc:249:31
...

  Previous read of size 8 at 0x7b580011f260 by thread T5 (mutexes: write 
M835553159786377312):
#0 kudu::tablet::OpState::start_time() const src/kudu/tablet/ops/op.h:272:40
#1 kudu::tablet::WriteOp::ToString() const 
src/kudu/tablet/ops/write_op.cc:378:36
#2 kudu::tablet::OpDriver::ToStringUnlocked() const 
src/kudu/tablet/ops/op_driver.cc:209:23
#3 kudu::tablet::OpDriver::ToString() const 
src/kudu/tablet/ops/op_driver.cc:203:10
#4 kudu::tablet::TabletReplica::GetInFlightOps(...) const 
src/kudu/tablet/tablet_replica.cc:728:41
#5 kudu::tserver::TabletServerPathHandlers::HandleTransactionsPage(...) 
src/kudu/tserver/tserver_path_handlers.cc:286:14
...

Change-Id: I52de0840aa20f64cf15c7a9da2d553257c7e85e7
---
M src/kudu/tablet/ops/op.h
1 file changed, 1 insertion(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21427/1
--
To view, visit http://gerrit.cloudera.org:8080/21427
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I52de0840aa20f64cf15c7a9da2d553257c7e85e7
Gerrit-Change-Number: 21427
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [security-flags-itest] Fix missing command line flags

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21399 )

Change subject: [security-flags-itest] Fix missing command line flags
..


Patch Set 3: Code-Review+1

> Please add a proper description of the problem and the solution.
 >
 > + Mention that the test fails because the flag is not registered
 > properly.
 > + The flag is defined in server_process.a. We do not use any
 > function or variable from it.
 > + It is a known C++ phenomenon that the INITIALIZATION ROUTINES of
 > the library might not be included if no symbol is used from the
 > library (despite the initialization routines having an effect on
 > used functionality, such as registering the variable in the gflags
 > ecosystem).
 >
 > Also, --whole-archive fixes the issue and works perfectly if you
 > use it only for the problematic library. The first google result
 > for "how to add --whole-archive in cmake" works. You are doing
 > something wrong...

I'm OK with the current solution.  I'm not sure that it's worth bringing in the 
--whole-archive alternative: we do have the LTO optimizations turned on, so it 
would be necessary to clarify how it affects the other binaries where the util 
library is linked into.


--
To view, visit http://gerrit.cloudera.org:8080/21399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc
Gerrit-Change-Number: 21399
Gerrit-PatchSet: 3
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward (763)
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Tue, 14 May 2024 17:51:01 +
Gerrit-HasComments: No


[kudu-CR] [security-flags-itest] Fix missing command line flags

2024-05-14 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21399 )

Change subject: [security-flags-itest] Fix missing command line flags
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc
Gerrit-Change-Number: 21399
Gerrit-PatchSet: 3
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward (763)
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Tue, 14 May 2024 17:47:38 +
Gerrit-HasComments: No


[kudu-CR] [common] get rid of MutexLock

2024-05-14 Thread Alexey Serbin (Code Review)
Hello Marton Greber, Zoltan Chovan, Attila Bukor, Kudu Jenkins, Abhishek 
Chennaka, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21415

to look at the new patch set (#4).

Change subject: [common] get rid of MutexLock
..

[common] get rid of MutexLock

Since contemporary STL library provides both std::lock_guard and
std::unique_lock, there is no need to keep MutexLock.

Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
---
M src/kudu/clock/builtin_ntp.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
M src/kudu/rpc/service_queue.h
M src/kudu/server/diagnostics_log.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/table_scanner.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_service.h
M src/kudu/util/async_logger.cc
M src/kudu/util/async_logger.h
M src/kudu/util/barrier.h
M src/kudu/util/blocking_queue-test.cc
M src/kudu/util/blocking_queue.h
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/countdown_latch.h
M src/kudu/util/debug/trace_event_impl.cc
M src/kudu/util/debug/trace_event_synthetic_delay.cc
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
M src/kudu/util/mutex.h
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/rwc_lock.cc
M src/kudu/util/test_graph.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
33 files changed, 146 insertions(+), 197 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/21415/4
--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] [common] get rid of MutexLock

2024-05-14 Thread Alexey Serbin (Code Review)
Hello Marton Greber, Zoltan Chovan, Attila Bukor, Kudu Jenkins, Abhishek 
Chennaka, Wang Xixu,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21415

to look at the new patch set (#3).

Change subject: [common] get rid of MutexLock
..

[common] get rid of MutexLock

Since contemporary STL library provides both std::lock_guard and
std::unique_lock, there is no need to keep MutexLock.

Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
---
M src/kudu/clock/builtin_ntp.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
M src/kudu/rpc/service_queue.h
M src/kudu/server/diagnostics_log.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/table_scanner.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_service.h
M src/kudu/util/async_logger.cc
M src/kudu/util/async_logger.h
M src/kudu/util/barrier.h
M src/kudu/util/blocking_queue-test.cc
M src/kudu/util/blocking_queue.h
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/countdown_latch.h
M src/kudu/util/debug/trace_event_impl.cc
M src/kudu/util/debug/trace_event_synthetic_delay.cc
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
M src/kudu/util/mutex.h
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/rwc_lock.cc
M src/kudu/util/test_graph.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
33 files changed, 144 insertions(+), 197 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/21415/3
--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 


[kudu-CR] [common] get rid of MutexLock

2024-05-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21415 )

Change subject: [common] get rid of MutexLock
..


Patch Set 2:

> > > Patch Set 2: Code-Review+1
 > > >
 > > > My only concern/nit about this change is wether we're loosing a
 > > bit of code readability, many places a simple "lock" is used for
 > > name, and the information about the type of the lock is a bit
 > > harder to find out (e.g. have to look up the header file and find
 > > the lock declaration to see the lock type).
 > > > It's not a huge issue, and renaming every instance to more
 > > specific names seems excessive, just wanted to bring it up and
 > hear
 > > your thoughts.
 > >
 > > I think I have to clarify a bit, I meant that using
 > > 'decltype(insert-lock-name)' hides the lock type information, but
 > I
 > > might be just splitting hairs here :)
 >
 > That information in one click away if using an IDE (or one
 > key-press away if using YCM or other editors plugins).  Yes, it's
 > no longer

Yes, the type information is no longer duplicated as it used to be, but I don't 
see it as something negative.  Take a look at other class members used around 
in the same .cc file: they have information on their type only in the .h header 
file where they are declared/defined, and everybody just uses the members in 
the .cc file without any problem.


--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Tue, 14 May 2024 05:43:37 +
Gerrit-HasComments: No


[kudu-CR] [common] get rid of MutexLock

2024-05-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21415 )

Change subject: [common] get rid of MutexLock
..


Patch Set 2:

> > Patch Set 2: Code-Review+1
 > >
 > > My only concern/nit about this change is wether we're loosing a
 > bit of code readability, many places a simple "lock" is used for
 > name, and the information about the type of the lock is a bit
 > harder to find out (e.g. have to look up the header file and find
 > the lock declaration to see the lock type).
 > > It's not a huge issue, and renaming every instance to more
 > specific names seems excessive, just wanted to bring it up and hear
 > your thoughts.
 >
 > I think I have to clarify a bit, I meant that using
 > 'decltype(insert-lock-name)' hides the lock type information, but I
 > might be just splitting hairs here :)

That information in one click away if using an IDE (or one key-press away if 
using YCM or other editors plugins).  Yes, it's no longer


--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Comment-Date: Tue, 14 May 2024 05:25:36 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3568 Fix compaction budgeting test by setting memory hard limit

2024-05-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21416 )

Change subject: KUDU-3568 Fix compaction budgeting test by setting memory hard 
limit
..

KUDU-3568 Fix compaction budgeting test by setting memory hard limit

TestRowSetCompactionSkipWithBudgetingConstraints can fail if the
memory on node running the test is high. It happens because the test
generates deltas of size worth a few MBs that is multiplied with a
preset factor to ensure the result (i.e. memory required for rowset
compaction completion) is of high value of the order of 200 GB per
rowset.

Even though nodes running the test generally don't have so much
physical memory, it is still possible to end up with high memory nodes.
On such nodes, the test might fail.

The patch fixes that problem by deterministically ensuring that
compaction memory requirement is always higher than the memory hard
limit. It does that by doing the following:
1. Move out the budgeting compaction tests out in a separate binary.
2. This gives flexibility to set the memory hard limit as per test
   needs. It is important to node that once a memory hard limit is
   set, it remains the same for all tests executed through
   binary lifecycle.
3. Set the hard memory limit to 1 GB which is enough to handle compaction
   requirements for TestRowSetCompactionProceedWithNoBudgetingConstraints.
   For TestRowSetCompactionSkipWithBudgetingConstraints, it is not
   enough because we set the delta memory factor high to exceed 1 GB.
   Both the test are now expected to succeed deterministically.

Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Reviewed-on: http://gerrit.cloudera.org:8080/21416
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
---
M src/kudu/tablet/CMakeLists.txt
A src/kudu/tablet/compaction-highmem-test.cc
M src/kudu/tablet/compaction-test.cc
3 files changed, 221 insertions(+), 143 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Gerrit-Change-Number: 21416
Gerrit-PatchSet: 3
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3568 Fix compaction budgeting test by setting memory hard limit

2024-05-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21416 )

Change subject: KUDU-3568 Fix compaction budgeting test by setting memory hard 
limit
..


Patch Set 2: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Gerrit-Change-Number: 21416
Gerrit-PatchSet: 2
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 13 May 2024 16:01:51 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3568 Fix compaction budgeting test by setting memory hard limit

2024-05-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-3568 Fix compaction budgeting test by setting memory hard 
limit
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Gerrit-Change-Number: 21416
Gerrit-PatchSet: 2
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] KUDU-3568 Fix compaction budgeting test by setting memory hard limit

2024-05-13 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21416 )

Change subject: KUDU-3568 Fix compaction budgeting test by setting memory hard 
limit
..


Patch Set 2: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Gerrit-Change-Number: 21416
Gerrit-PatchSet: 2
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 13 May 2024 16:01:48 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3568 Fix compaction budgeting test by setting memory hard limit

2024-05-11 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21416 )

Change subject: KUDU-3568 Fix compaction budgeting test by setting memory hard 
limit
..


Patch Set 1: Code-Review+1

(5 comments)

overall looks good to me, just a few nits

http://gerrit.cloudera.org:8080/#/c/21416/1/src/kudu/tablet/compaction-highmem-test.cc
File src/kudu/tablet/compaction-highmem-test.cc:

http://gerrit.cloudera.org:8080/#/c/21416/1/src/kudu/tablet/compaction-highmem-test.cc@18
PS1, Line 18: #include 
nit: please use C++ style headers, i.e. 


http://gerrit.cloudera.org:8080/#/c/21416/1/src/kudu/tablet/compaction-highmem-test.cc@142
PS1, Line 142: (1024 * 1024 * 1024)
nit: drop the parentheses?


http://gerrit.cloudera.org:8080/#/c/21416/1/src/kudu/tablet/compaction-highmem-test.cc@142
PS1, Line 142:   FLAGS_memory_limit_hard_bytes = (1024 * 1024 * 1024);
 :
 :   FLAGS_rowset_compaction_ancient_delta_threshold_enabled = true;
 :   FLAGS_rowset_compaction_enforce_preset_factor = true;
 :   FLAGS_rowset_compaction_memory_estimate_enabled = true;
 :
 :   // Ensure memory budgeting applies
 :   FLAGS_rowset_compaction_estimate_min_deltas_size_mb = 0;
Since this is in a dedicated binary now, consider setting these flags just once 
in per-suite static setup function called SetUpTestSuite().  For details, see 
https://google.github.io/googletest/advanced.html#sharing-resources-between-tests-in-the-same-test-suite


http://gerrit.cloudera.org:8080/#/c/21416/1/src/kudu/tablet/compaction-highmem-test.cc@166
PS1, Line 166: sw.elapsed().ToString(),
 : trace->MetricsAsJSON());
nit: please fix the indent


http://gerrit.cloudera.org:8080/#/c/21416/1/src/kudu/tablet/compaction-highmem-test.cc@179
PS1, Line 179:   const uint32_t num_rowsets = 10;
 :   const uint32_t num_rows_per_rowset = 2;
 :   const uint32_t num_updates = 5000 * size_factor;
nit: could make these constexpr ?



--
To view, visit http://gerrit.cloudera.org:8080/21416
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I85d104e1d066507ce8e72a00cc5165cc4b85e48d
Gerrit-Change-Number: 21416
Gerrit-PatchSet: 1
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 11 May 2024 16:51:14 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc
File src/kudu/util/slru_cache.cc:

http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@543
PS17, Line 543:   // Release from either the probationary or the protected 
shard.
  :   if (!e->protected_segment) {
  : probationary_shard_->Release(handle);
  :   } else {
  : protected_shard_->Release(handle);
  :   }
  : }
What guards against the following scenarios:

Scenario_0:
 * thread_A: an instance of Cache::UniqueHandle as a wrapper for entry_X is 
being destroyed
 * thread_A: while executing SLRUCacheShardPair::Release(),  finds entry_X 
isn't in the protected shard
 * thread_B: calls SLRUCacheShardPair::Lookup(), moving entry_X into the 
protected shard
 * thread_A: calls Release() on the corresponding probationary shard for entry_X

Scenario_1:
 * thread_A: an instance of Cache::UniqueHandle as a wrapper for entry_X is 
being destroyed
 * thread_A: while executing SLRUCacheShardPair::Release(), finds entry_X is in 
the protected shard
 * thread_B: calls SLRUCacheShardPair::Lookup() for some other entry_Y which is 
in the probationary shard, but due to crossing the threshold of lookup 
operations per entry, it's now being moved into the protected shard, so entry_X 
is pushed out of the protected shard
 * thread_A: calls Release() on the corresponding protected shard for entry_X

If nothing prevents from having such scenarios, then a few questions:
 * Does the contents SLRU cache stay consistent after that?
 * Do metrics of the SLRU cache stay consistent after that?



--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 17
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 11 May 2024 01:09:24 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 16:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc
File src/kudu/util/slru_cache.cc:

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@254
PS16, Line 254:  e = table_.Lookup(key, hash);
What guarantees do we have to be sure that 'table_' cannot be modified under 
the hood by concurrently running SLRUCacheShard::Insert()?  It would be great 
to add a note about that.


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.cc@267
PS16, Line 267:   e = table_.Lookup(key, hash);
ditto: what if Insert() is running concurrently and modifying the 'table_' -- 
what protects us from such concurrency issues?  Would be great to add a note in 
the code.


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc
File src/kudu/util/slru_cache.cc:

http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@273
PS17, Line 273:   bool last_reference = Unref(e);
  :   if (last_reference) {
nit: could these be joined, adding a comment on what's the intention of this 
code?


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@471
PS17, Line 471:   {
What is the purpose of this extra scope?


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@501
PS17, Line 501:   {
What is the purpose of this extra scope?


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@552
PS17, Line 552:   {
What is the purpose of this extra scope?


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@561
PS17, Line 561:   {
What is the purpose of this extra scope?


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@581
PS17, Line 581:
nit:


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@601
PS17, Line 601: // Similar to STLDeleteElements().
  : auto begin = shards_.begin();
  : while (begin != shards_.end()) {
  :   auto temp = begin;
  :   ++begin;
  :   temp->reset(nullptr);
  : }
Is this really needed?  Wouldn't the contained unique_ptr be destroyed 
automatically upon the destruction of the containing vector?


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/slru_cache.cc@694
PS17, Line 694: uint8_t* data = reinterpret_cast(h);
  : delete [] data;
nit: could these lines be joined?



--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 16
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 10 May 2024 21:33:01 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 17:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h@520
PS17, Line 520:   // Returns true if probationary segment contains key.
  :   // Returns false if not.
  :   // Only used for SLRU cache.
  :   virtual bool ProbationaryContains(const Slice& key) = 0;
  :
  :   // Returns true if protected segment contains key.
  :   // Returns false if not.
  :   // Only used for SLRU cache.
  :   virtual bool ProtectedContains(const Slice& key) = 0;
> Similar to SetSegmentMetrics(), this doesn't seem to be a place for these m
SLRUCacheShard is the right place for these, even without any extras, I don't 
see any reason to keep these specific methods in this generic interface (but I 
might be missing something).



--
To view, visit http://gerrit.cloudera.org:8080/20607
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I45531534a2049dd38c002f4dc7982df9fd46e0bb
Gerrit-Change-Number: 20607
Gerrit-PatchSet: 17
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 10 May 2024 20:57:44 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce SLRU cache

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20607 )

Change subject: KUDU-613: Introduce SLRU cache
..


Patch Set 17:

(16 comments)

A few more high-level observations.

http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20607/16//COMMIT_MSG@22
PS16, Line 22: Lookups/sec
It would be nice to mentions the following details:
  * What type of build it was?
  * What sort of machine it was: CPU cores and frequency?

It's crucial to measure this for RELEASE builds, we aren't interested to look 
at other types of builds here, I guess.


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache-bench.cc
File src/kudu/util/cache-bench.cc:

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache-bench.cc@92
PS16, Line 92:   ret+= " SLRU";
nit for here and elsewhere: separate variable and the operator by space


http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/17/src/kudu/util/cache.h@520
PS17, Line 520:   // Returns true if probationary segment contains key.
  :   // Returns false if not.
  :   // Only used for SLRU cache.
  :   virtual bool ProbationaryContains(const Slice& key) = 0;
  :
  :   // Returns true if protected segment contains key.
  :   // Returns false if not.
  :   // Only used for SLRU cache.
  :   virtual bool ProtectedContains(const Slice& key) = 0;
Similar to SetSegmentMetrics(), this doesn't seem to be a place for these 
methods.


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@58
PS16, Line 58: Protects against random/long reads polluting cache.
nit: consider either making this part of the comment more generic (as it should 
be at this level) or dropping this part altogether


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@91
PS16, Line 91: protected_segment
nit: should be rather named 'in_protected_segment'?


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/cache.h@116
PS16, Line 116: sanitize
style nit: since this isn't an accessor-like method, its name should be 
capitalized


http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h
File src/kudu/util/cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/14/src/kudu/util/cache.h@228
PS14, Line 228: // ensure this is safe, e.g. by destructing the entity that 
owned the
  : // original metrics.
  : kReset,
  :   };
  :   // Set the cache metrics to update corresponding counters 
accordingly.
  :   virtual void SetMetrics(std::unique_ptr metrics,
  :   ExistingMetricsPolicy metrics_policy) 
= 0;
  :
  :   // Sets probationary and protected segments metrics to update 
corresponding counters accordingly.
  :   //
  :   // An upgrade of one entry will increment the number of 
evictions from the probationary segment by
  :   // one and the number of insertions into the protected 
segment by one.
  :   //
  :   // A downgrade of one entry will increment the number of 
evictions from the protected segment by
  :   // one and the number of insertions into the probationary 
segment by one.
  :   //
  :   // To calculate the number of inserts for the SLRU cache, 
subtract the number of evictions
  :   // of the protected segment from the number of inserts of the 
probationary segment.
  :   //
> This will be used in the block cache to set the metrics for both segments o
OK.  But even with that, instead of polluting the interface of the base Cache 
class you could make the BlockCache class to be a template by the type of the 
underlying cache implementation, and remove both SetMetrics() and 
SetSegmentMetrics() from the common Cache interface.


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/nvm_cache.h
File src/kudu/util/nvm_cache.h:

http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/nvm_cache.h@46
PS16, Line 46:   // Number of lookups for this entry. Used for upgrading to 
protected segment in SLRU cache.
 :   // Resets to 0 when moved between probationary and protected 
segments in both directions.
 :   uint32_t lookups;
Is this really needed?  I didn't see NVM-based implementation of SLRU cache, so 
this looks a bit strange given you decided not to converge on {S}LRUHandle 
structure


http://gerrit.cloudera.org:8080/#/c/20607/16/src/kudu/util/slru_cache.h
File src/kudu/util/slru_cache.h:


[kudu-CR] KUDU-613: Integrate SLRU cache into block cache

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21390 )

Change subject: KUDU-613: Integrate SLRU cache into block cache
..


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc
File src/kudu/cfile/block_cache.cc:

http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@235
PS2, Line 235: BlockCache::BlockCache()
 : : BlockCache(FLAGS_block_cache_capacity_mb * 1024 * 1024) {
 : }
Interesting, but why this constructor isn't updated to take care of different 
type of the underlying cache?  That's especially strange given the funny code 
below for the next constructor when the 'capacity' parameter is ignored upon 
using the SLRU type of the cache.


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@242
PS2, Line 242:  
CreateCache(FLAGS_block_cache_probationary_segment_capacity_mb * 1024 * 1024,
 :  
FLAGS_block_cache_protected_segment_capacity_mb * 1024 * 1024,
 :  
FLAGS_block_cache_lookups_before_upgrade)) {
That's a bad smell -- how 'capacity' is then used in this branch?


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@247
PS2, Line 247: BlockCache::BlockCache(size_t probationary_segment_capacity,
 :size_t protected_segment_capacity,
 :size_t lookups)
 : : cache_(CreateCache(probationary_segment_capacity, 
protected_segment_capacity, lookups)) {
 : }
With this code and the code above it smells even worse -- now there is a 
complete mess with trying to understand when a particular type of cache is 
created based on the constructor invoked and the --block_cache_eviction_policy 
setting.



--
To view, visit http://gerrit.cloudera.org:8080/21390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I04411ab2756045f15a272f3397d46d871b087b03
Gerrit-Change-Number: 21390
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Comment-Date: Fri, 10 May 2024 18:29:02 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Integrate SLRU cache into block cache

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21390 )

Change subject: KUDU-613: Integrate SLRU cache into block cache
..


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/21390/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21390/2//COMMIT_MSG@13
PS2, Line 13: The default
: value is still the old LRU cache.
I think it's crucial to enable at least a few integration-style scenarios using 
the block cache backed by the new SLRU  implementation.

I'd think of scenarios in tablet_server_stress-test.cc, 
full_stack-insert-scan-test.cc, dense_node-itest.cc and 
heavy-update-compaction-itest.cc


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc
File src/kudu/cfile/block_cache.cc:

http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@46
PS2, Line 46: Either 'LRU' (default) or 'SLRU' (experimental
Please add a validator for the value, and make this flag case-insensitive, so 
"LRU" and "lru" both would work (as well as "lRu").


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@46
PS2, Line 46:
nit: drop the trailing space?


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@49
PS2, Line 49: block_cache_probationary_segment_capacity_mb
For this the block_cache_protected_segment_capacity_mb flag below: Does it make 
sense to add a validator for this flag to reject at least negative values?  
From the other side, why not to use unsigned type for this flag?


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@50
PS2, Line 50: MB
here and below: is this actually MiB/MiBytes?


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@58
PS2, Line 58: ,"
nit: reconsider the punctuation and the result message for the flag


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@171
PS2, Line 171:  "Lower 
--block_cache_probationary_segment_capacity_mb and"
nit: add space after 'and'


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@194
PS2, Line 194: FLAGS_block_cache_eviction_policy
Please make sure the value of this flag is treated as case-insensitive.  Doing 
that in just one place and then reusing the function that converts a string 
into a enumeration should be robust enough.


http://gerrit.cloudera.org:8080/#/c/21390/2/src/kudu/cfile/block_cache.cc@222
PS2, Line 222: ToUpperCase(FLAGS_block_cache_eviction_policy, 
_block_cache_eviction_policy);
I'm not sure it's a good idea to update the actual value of the flag like this: 
it might quite unexpected behavior from the user's perspective.  Why not to use 
a temporary variable for converting to the upper case?



--
To view, visit http://gerrit.cloudera.org:8080/21390
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I04411ab2756045f15a272f3397d46d871b087b03
Gerrit-Change-Number: 21390
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Comment-Date: Fri, 10 May 2024 17:56:59 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-613: Introduce new BlockCache metrics

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21389 )

Change subject: KUDU-613: Introduce new BlockCache metrics
..


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/21389/2/src/kudu/util/block_cache_metrics.cc
File src/kudu/util/block_cache_metrics.cc:

PS2:
> The new metrics give us information on inserts into the protected segment o
>From the usability perspective, I'd think of all the metrics that allow for 
>computing SLRU cache's hit and miss ratios and show how many inserts and 
>evictions happened.  That's what end-user is about to look at.  Yes, most 
>likely it's possible to deduce that from counters available for the 
>probational and protected segments (but I haven't verified that), but where 
>one can find the rules to perform such computations and how many mistakes a 
>person is going to make when trying to compute those?

I'd think of providing top-level metrics for the SLRU cache as we currently 
have for the LRU cache just for better usability.


http://gerrit.cloudera.org:8080/#/c/21389/2/src/kudu/util/block_cache_metrics.cc@176
PS2, Line 176:
nit for here and below: this isn't aligned with current Kudu style guide



--
To view, visit http://gerrit.cloudera.org:8080/21389
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1c8181ec6bea301605aaef7db0003c3eaef3072d
Gerrit-Change-Number: 21389
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Fri, 10 May 2024 17:50:01 +
Gerrit-HasComments: Yes


[kudu-CR] [master] fix race in auto leader rebalancing

2024-05-10 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21417 )

Change subject: [master] fix race in auto leader rebalancing
..

[master] fix race in auto leader rebalancing

It turned out that auto leader rebalancing task wasn't explicitly
shutdown upon shutting down catalog manager.  That lead to race
conditions as reported by TSAN, at least in test scenarios (see below).
This patch addresses the issue.

  WARNING: ThreadSanitizer: data race (pid=23827)
Write of size 1 at 0x7b408208 by main thread:
  #0 AnnotateRWLockDestroy 
thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp:264
 (auto_rebalancer-test+0x33575e)
  #1 kudu::rw_spinlock::~rw_spinlock() src/kudu/util/locks.h:89:5 
(libmaster.so+0x359376)
  #2 kudu::master::TSManager::~TSManager() 
src/kudu/master/ts_manager.cc:108:1 (libmaster.so+0x4ad201)
  #3 kudu::master::TSManager::~TSManager() 
src/kudu/master/ts_manager.cc:107:25 (libmaster.so+0x4ad229)
  #4 
std::__1::default_delete::operator()(kudu::master::TSManager*)
 const thirdparty/installed/tsan/include/c++/v1/memory:2262:5 
(libmaster.so+0x407ce7)
  #5 std::__1::unique_ptr 
>::reset(kudu::master::TSManager*) 
thirdparty/installed/tsan/include/c++/v1/memory:2517:7 (libmaster.so+0x40157d)
  #6 std::__1::unique_ptr >::~unique_ptr() 
thirdparty/installed/tsan/include/c++/v1/memory:2471:19 (libmaster.so+0x4015eb)
  #7 kudu::master::Master::~Master() src/kudu/master/master.cc:263:1 
(libmaster.so+0x3f7a4a)
  #8 kudu::master::Master::~Master() src/kudu/master/master.cc:261:19 
(libmaster.so+0x3f7dc9)
  #9 
std::__1::default_delete::operator()(kudu::master::Master*)
 const thirdparty/installed/tsan/include/c++/v1/memory:2262:5 
(libmaster.so+0x435627)
  #10 std::__1::unique_ptr >::reset(kudu::master::Master*) 
thirdparty/installed/tsan/include/c++/v1/memory:2517:7 (libmaster.so+0x42e6ed)
  #11 kudu::master::MiniMaster::Shutdown() 
src/kudu/master/mini_master.cc:120:13 (libmaster.so+0x4c2612)
...
Previous atomic write of size 4 at 0x7b408208 by thread T439 (mutexes: 
write M1141235379631443968):
  #0 __tsan_atomic32_compare_exchange_strong 
thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp:780
 (auto_rebalancer-test+0x33eb60)
  #1 base::subtle::Release_CompareAndSwap(int volatile*, int, int) 
/src/kudu/gutil/atomicops-internals-tsan.h:88:3 (libmaster.so+0x2e2b34)
  #2 kudu::rw_semaphore::unlock_shared() src/kudu/util/rw_semaphore.h:91:19 
(libmaster.so+0x2e29c8)
  #3 kudu::rw_spinlock::unlock_shared() src/kudu/util/locks.h:99:10 
(libmaster.so+0x2e28ef)
  #4 std::__1::shared_lock::~shared_lock() 
/thirdparty/installed/tsan/include/c++/v1/shared_mutex:369:19 
(libmaster.so+0x2e23e0)
  #5 
kudu::master::TSManager::GetAllDescriptors(std::__1::vector,
 std::__1::allocator > >*) 
const src/kudu/master/ts_manager.cc:206:1 (libmaster.so+0x4adeb6)
  #6 kudu::master::AutoLeaderRebalancerTask::RunLeaderRebalancer() 
src/kudu/master/auto_leader_rebalancer.cc:405:16 (libmaster.so+0x2fb51b)
  #7 kudu::master::AutoLeaderRebalancerTask::RunLoop() 
src/kudu/master/auto_leader_rebalancer.cc:445:7 (libmaster.so+0x2fbaa9)

This is a follow-up to 10efaf2c77dfe5e4474505e0267c583c011703be.

Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
Reviewed-on: http://gerrit.cloudera.org:8080/21417
Reviewed-by: Wang Xixu <1450306...@qq.com>
Tested-by: Alexey Serbin 
Reviewed-by: Yifan Zhang 
---
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
2 files changed, 9 insertions(+), 1 deletion(-)

Approvals:
  Wang Xixu: Looks good to me, but someone else must approve
  Alexey Serbin: Verified
  Yifan Zhang: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
Gerrit-Change-Number: 21417
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] [master] fix race in auto leader rebalancing

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21417 )

Change subject: [master] fix race in auto leader rebalancing
..


Patch Set 1: Verified+1

unrelated failure in java tests


--
To view, visit http://gerrit.cloudera.org:8080/21417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
Gerrit-Change-Number: 21417
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Fri, 10 May 2024 03:04:31 +
Gerrit-HasComments: No


[kudu-CR] [master] fix race in auto leader rebalancing

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [master] fix race in auto leader rebalancing
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
Gerrit-Change-Number: 21417
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Reviewer: Yingchun Lai 


[kudu-CR] [common] get rid of MutexLock

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21415 )

Change subject: [common] get rid of MutexLock
..


Patch Set 2: Verified+1

unrelated test failures: raft_consensus-itest and Java tests


--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Comment-Date: Thu, 09 May 2024 23:35:48 +
Gerrit-HasComments: No


[kudu-CR] [common] get rid of MutexLock

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [common] get rid of MutexLock
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 


[kudu-CR] [common] get rid of MutexLock

2024-05-09 Thread Alexey Serbin (Code Review)
Hello Marton Greber, Attila Bukor, Kudu Jenkins, Abhishek Chennaka,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21415

to look at the new patch set (#2).

Change subject: [common] get rid of MutexLock
..

[common] get rid of MutexLock

Since contemporary STL library provides both std::lock_guard and
std::unique_lock, there is no need to keep MutexLock.

Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
---
M src/kudu/clock/builtin_ntp.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
M src/kudu/rpc/service_queue.h
M src/kudu/server/diagnostics_log.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/table_scanner.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_service.h
M src/kudu/util/async_logger.cc
M src/kudu/util/async_logger.h
M src/kudu/util/barrier.h
M src/kudu/util/blocking_queue-test.cc
M src/kudu/util/blocking_queue.h
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/countdown_latch.h
M src/kudu/util/debug/trace_event_impl.cc
M src/kudu/util/debug/trace_event_synthetic_delay.cc
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
M src/kudu/util/mutex.h
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/rwc_lock.cc
M src/kudu/util/test_graph.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
33 files changed, 154 insertions(+), 195 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/21415/2
--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21414 )

Change subject: [common] switch from unique_lock to lock_guard
..


Patch Set 2: Verified+1

unrelated test flakiness: some well-known flakies and a new one (apparently, 
not related to this patch) https://issues.apache.org/jira/browse/KUDU-3573


--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 09 May 2024 20:26:07 +
Gerrit-HasComments: No


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [common] switch from unique_lock to lock_guard
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [master] fix race in auto leader rebalancing

2024-05-09 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21417


Change subject: [master] fix race in auto leader rebalancing
..

[master] fix race in auto leader rebalancing

It turned out that auto leader rebalancing task wasn't explicitly
shutdown upon shutting down catalog manager.  That lead to race
conditions as reported by TSAN, at least in test scenarios (see below).
This patch addresses the issue.

  WARNING: ThreadSanitizer: data race (pid=23827)
Write of size 1 at 0x7b408208 by main thread:
  #0 AnnotateRWLockDestroy 
thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_ann.cpp:264
 (auto_rebalancer-test+0x33575e)
  #1 kudu::rw_spinlock::~rw_spinlock() src/kudu/util/locks.h:89:5 
(libmaster.so+0x359376)
  #2 kudu::master::TSManager::~TSManager() 
src/kudu/master/ts_manager.cc:108:1 (libmaster.so+0x4ad201)
  #3 kudu::master::TSManager::~TSManager() 
src/kudu/master/ts_manager.cc:107:25 (libmaster.so+0x4ad229)
  #4 
std::__1::default_delete::operator()(kudu::master::TSManager*)
 const thirdparty/installed/tsan/include/c++/v1/memory:2262:5 
(libmaster.so+0x407ce7)
  #5 std::__1::unique_ptr 
>::reset(kudu::master::TSManager*) 
thirdparty/installed/tsan/include/c++/v1/memory:2517:7 (libmaster.so+0x40157d)
  #6 std::__1::unique_ptr >::~unique_ptr() 
thirdparty/installed/tsan/include/c++/v1/memory:2471:19 (libmaster.so+0x4015eb)
  #7 kudu::master::Master::~Master() src/kudu/master/master.cc:263:1 
(libmaster.so+0x3f7a4a)
  #8 kudu::master::Master::~Master() src/kudu/master/master.cc:261:19 
(libmaster.so+0x3f7dc9)
  #9 
std::__1::default_delete::operator()(kudu::master::Master*)
 const thirdparty/installed/tsan/include/c++/v1/memory:2262:5 
(libmaster.so+0x435627)
  #10 std::__1::unique_ptr >::reset(kudu::master::Master*) 
thirdparty/installed/tsan/include/c++/v1/memory:2517:7 (libmaster.so+0x42e6ed)
  #11 kudu::master::MiniMaster::Shutdown() 
src/kudu/master/mini_master.cc:120:13 (libmaster.so+0x4c2612)
...
Previous atomic write of size 4 at 0x7b408208 by thread T439 (mutexes: 
write M1141235379631443968):
  #0 __tsan_atomic32_compare_exchange_strong 
thirdparty/src/llvm-11.0.0.src/projects/compiler-rt/lib/tsan/rtl/tsan_interface_atomic.cpp:780
 (auto_rebalancer-test+0x33eb60)
  #1 base::subtle::Release_CompareAndSwap(int volatile*, int, int) 
/src/kudu/gutil/atomicops-internals-tsan.h:88:3 (libmaster.so+0x2e2b34)
  #2 kudu::rw_semaphore::unlock_shared() src/kudu/util/rw_semaphore.h:91:19 
(libmaster.so+0x2e29c8)
  #3 kudu::rw_spinlock::unlock_shared() src/kudu/util/locks.h:99:10 
(libmaster.so+0x2e28ef)
  #4 std::__1::shared_lock::~shared_lock() 
/thirdparty/installed/tsan/include/c++/v1/shared_mutex:369:19 
(libmaster.so+0x2e23e0)
  #5 
kudu::master::TSManager::GetAllDescriptors(std::__1::vector,
 std::__1::allocator > >*) 
const src/kudu/master/ts_manager.cc:206:1 (libmaster.so+0x4adeb6)
  #6 kudu::master::AutoLeaderRebalancerTask::RunLeaderRebalancer() 
src/kudu/master/auto_leader_rebalancer.cc:405:16 (libmaster.so+0x2fb51b)
  #7 kudu::master::AutoLeaderRebalancerTask::RunLoop() 
src/kudu/master/auto_leader_rebalancer.cc:445:7 (libmaster.so+0x2fbaa9)

This is a follow-up to 10efaf2c77dfe5e4474505e0267c583c011703be.

Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
---
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
2 files changed, 9 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/21417/1
--
To view, visit http://gerrit.cloudera.org:8080/21417
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Iccd66d00280d22b37386230874937e5260f07f3b
Gerrit-Change-Number: 21417
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-08 Thread Alexey Serbin (Code Review)
Hello Attila Bukor, Kudu Jenkins, Abhishek Chennaka,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21414

to look at the new patch set (#2).

Change subject: [common] switch from unique_lock to lock_guard
..

[common] switch from unique_lock to lock_guard

To simplify working with STL synchronization primitives, this patch
changes from std::unique_lock to std::lock_guard where appropriate.
In addition, IWYU and ClangTidy's feedback has been addressed.

Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
---
M src/kudu/clock/hybrid_clock.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/log_cache.cc
M src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/rpc/rpcz_store.cc
M src/kudu/security/tls_context.cc
M src/kudu/server/webserver.cc
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/scanners.cc
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/trace_metrics.h
14 files changed, 36 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/21414/2
--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [common] get rid of MutexLock

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21415


Change subject: [common] get rid of MutexLock
..

[common] get rid of MutexLock

Since contemporary STL library provides both std::lock_guard and
std::unique_lock, there is no need to keep MutexLock.

Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
---
M src/kudu/clock/builtin_ntp.cc
M src/kudu/master/auto_leader_rebalancer.cc
M src/kudu/master/catalog_manager.cc
M src/kudu/rpc/service_pool.cc
M src/kudu/rpc/service_pool.h
M src/kudu/rpc/service_queue.h
M src/kudu/server/diagnostics_log.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tools/table_scanner.cc
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_copy_service.cc
M src/kudu/tserver/tablet_copy_service.h
M src/kudu/util/async_logger.cc
M src/kudu/util/async_logger.h
M src/kudu/util/barrier.h
M src/kudu/util/blocking_queue-test.cc
M src/kudu/util/blocking_queue.h
M src/kudu/util/cloud/instance_detector.cc
M src/kudu/util/countdown_latch.h
M src/kudu/util/debug/trace_event_impl.cc
M src/kudu/util/debug/trace_event_synthetic_delay.cc
M src/kudu/util/kernel_stack_watchdog.cc
M src/kudu/util/mem_tracker.cc
M src/kudu/util/mem_tracker.h
M src/kudu/util/mutex.h
M src/kudu/util/pstack_watcher.cc
M src/kudu/util/rwc_lock.cc
M src/kudu/util/test_graph.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
33 files changed, 155 insertions(+), 195 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/15/21415/1
--
To view, visit http://gerrit.cloudera.org:8080/21415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I49e0ef2c688ef8be74d018bb9bffe70b6655e654
Gerrit-Change-Number: 21415
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [common] switch from unique_lock to lock_guard
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21414 )

Change subject: [common] switch from unique_lock to lock_guard
..


Patch Set 1: Verified+1

unrelated test failure: KUDU-1495


--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 09 May 2024 04:05:05 +
Gerrit-HasComments: No


[kudu-CR] [common] switch from unique lock to lock guard

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21414


Change subject: [common] switch from unique_lock to lock_guard
..

[common] switch from unique_lock to lock_guard

To simplify working with STL synchronization primitives, this patch
changes from std::lock_guard to std::unique_lock where appropriate.
In addition, IWYU and ClangTidy's feedback has been addressed.

This patch doesn't contain any functional modifications.

Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
---
M src/kudu/clock/hybrid_clock.cc
M src/kudu/consensus/consensus_peers.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/log_cache.cc
M src/kudu/integration-tests/tombstoned_voting-stress-test.cc
M src/kudu/rpc/rpcz_store.cc
M src/kudu/security/tls_context.cc
M src/kudu/server/webserver.cc
M src/kudu/tablet/lock_manager-test.cc
M src/kudu/tablet/lock_manager.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tserver/scanners.cc
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/trace_metrics.h
14 files changed, 36 insertions(+), 31 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/14/21414/1
--
To view, visit http://gerrit.cloudera.org:8080/21414
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I72d86ca730113ec0652154d5ce509fc2e479befb
Gerrit-Change-Number: 21414
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] KUDU-613: Cleanup of cache code

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21018 )

Change subject: KUDU-613: Cleanup of cache code
..

KUDU-613: Cleanup of cache code

This patch moves some classes out of the
anonymous namespace and into the headers
of the cache and nvm_cache files. These
classes will be used by the new SLRU cache.
This path also templatizes the HandleTable class
to be used by both the cache and nvm_cache files.

Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
Reviewed-on: http://gerrit.cloudera.org:8080/21018
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
---
M src/kudu/util/cache.cc
M src/kudu/util/cache.h
M src/kudu/util/file_cache.cc
M src/kudu/util/nvm_cache.cc
M src/kudu/util/nvm_cache.h
5 files changed, 206 insertions(+), 287 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
Gerrit-Change-Number: 21018
Gerrit-PatchSet: 10
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)


[kudu-CR] KUDU-613: Cleanup of cache code

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21018 )

Change subject: KUDU-613: Cleanup of cache code
..


Patch Set 9: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21018
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I506d4577c0ae873b01d7fa4f53846d6fd0f664cf
Gerrit-Change-Number: 21018
Gerrit-PatchSet: 9
Gerrit-Owner: Mahesh Reddy 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 08 May 2024 16:57:50 +
Gerrit-HasComments: No


[kudu-CR] [alter table-test] Increase assertion timeout

2024-05-08 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21347 )

Change subject: [alter_table-test] Increase assertion timeout
..


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/21347/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21347/1//COMMIT_MSG@7
PS1, Line 7: [alter_table-test] Increase assertion timeout
There is a Jira issue about flakiness of this particular scenario: KUDU-3509  
Does it make sense to refer to JIRA in the summary?


http://gerrit.cloudera.org:8080/#/c/21347/1//COMMIT_MSG@10
PS1, Line 10: So timeout was increased to 60 seconds.
Does it actually help making this test more stable?  If the test in RELEASE 
build failed the same way as mentioned in KUDU-3509, I'm not sure this patch 
could help with the issue.

Could you provide results of running the test before and after the fix via 
dist-test, at least for RELEASE and DEBUG builds?


http://gerrit.cloudera.org:8080/#/c/21347/1//COMMIT_MSG@9
PS1, Line 9: On ubuntu 22.04 release build, AlterReplicationFactorWhileWriting 
test
   : failed with timeout.
Could you attach the log to KUDU-3509?



--
To view, visit http://gerrit.cloudera.org:8080/21347
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifc51365a798569c22e680eb3e7723a5f330d43c5
Gerrit-Change-Number: 21347
Gerrit-PatchSet: 1
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Wed, 08 May 2024 14:16:48 +
Gerrit-HasComments: Yes


[kudu-CR] [metrics] Add metrics for create and delete op time

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21316 )

Change subject: [metrics] Add metrics for create and delete op time
..

[metrics] Add metrics for create and delete op time

Add server-level statistics to track the time consumption of
create tablet and delete tablet operations.
The addition of monitoring items will aid in historical issue
tracking and analysis, as well as facilitate the configuration
of monitoring alarms.

Change-Id: I02bd52013caa94a33143cb16ff3831a49b74bac4
Reviewed-on: http://gerrit.cloudera.org:8080/21316
Tested-by: Kudu Jenkins
Reviewed-by: Alexey Serbin 
---
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/ts_tablet_manager-test.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
4 files changed, 82 insertions(+), 0 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Alexey Serbin: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/21316
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I02bd52013caa94a33143cb16ff3831a49b74bac4
Gerrit-Change-Number: 21316
Gerrit-PatchSet: 6
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [metrics] Add metrics for create and delete op time

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21316 )

Change subject: [metrics] Add metrics for create and delete op time
..


Patch Set 5: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21316
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bd52013caa94a33143cb16ff3831a49b74bac4
Gerrit-Change-Number: 21316
Gerrit-PatchSet: 5
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 07 May 2024 14:04:00 +
Gerrit-HasComments: No


[kudu-CR] [CMakeLists] Make kudu test main static

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21282 )

Change subject: [CMakeLists] Make kudu_test_main static
..


Patch Set 7: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea
Gerrit-Change-Number: 21282
Gerrit-PatchSet: 7
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Tue, 07 May 2024 07:26:40 +
Gerrit-HasComments: No


[kudu-CR] [CMakeLists] Make kudu test main static

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [CMakeLists] Make kudu_test_main static
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea
Gerrit-Change-Number: 21282
Gerrit-PatchSet: 7
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] [metrics] Add metrics for create and delete op time

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21316 )

Change subject: [metrics] Add metrics for create and delete op time
..


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/21316/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21316/1//COMMIT_MSG@11
PS1, Line 11: These monitoring metrics will be very helpful for analyzing
: issues related to high CPU usage.
> Thank you very much for your response.
Yes: in general that makes sense, of course.  I was just trying to say that 
creating/deleting a tablet replica is mostly disk IO, but not many CPU cycles.

Adding metrics here and there might be a guessing game.  If the goal is to spot 
CPU bottlenecks, I can also recommend using built-in tracing: 
https://kudu.apache.org/docs/troubleshooting.html#kudu_tracing

In addition, running 'htop -p ' and performing stracing, etc. 
could pin-point particular threads that consume a lot of CPU.


http://gerrit.cloudera.org:8080/#/c/21316/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21316/4//COMMIT_MSG@11
PS4, Line 11: analyzing
: issues related to high CPU usage
How the stats on the duration of create/delete a tablet replica could help with 
analyzing high CPU usage scenarios  FWIW, most of the activity while 
creating/deleting a tablet is attributed to disk IO.

If you are interested in tracking CPU usage of some activity (not attributed to 
IO wait times), I could recommend taking a look at the 'reactor_load_percent' 
metric.

I hope this helps.


http://gerrit.cloudera.org:8080/#/c/21316/4/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

http://gerrit.cloudera.org:8080/#/c/21316/4/src/kudu/tserver/ts_tablet_manager.cc@273
PS4, Line 273:
What is the significance of this 'on the current node' part?  All the tablet 
metrics are attributed to the node where the tablet replica is hosted, no?  If 
so, maybe drop this part?


http://gerrit.cloudera.org:8080/#/c/21316/4/src/kudu/tserver/ts_tablet_manager.cc@274
PS4, Line 274:
Why kInfo, not kDebug?  Looking at metrics like 'tablets_opening_time_startup', 
it seems this sort of metric is something that would be used mostly for 
troubleshooting.


http://gerrit.cloudera.org:8080/#/c/21316/4/src/kudu/tserver/ts_tablet_manager.cc@281
PS4, Line 281:
ditto: maybe, kDebug is a better choice here?


http://gerrit.cloudera.org:8080/#/c/21316/4/src/kudu/tserver/ts_tablet_manager.cc@1168
PS4, Line 1168:
Shouldn't the delete_tablet_run_time_ metric be updated before return here as 
well?



--
To view, visit http://gerrit.cloudera.org:8080/21316
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I02bd52013caa94a33143cb16ff3831a49b74bac4
Gerrit-Change-Number: 21316
Gerrit-PatchSet: 1
Gerrit-Owner: KeDeng 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 07 May 2024 07:26:16 +
Gerrit-HasComments: Yes


[kudu-CR] [CMakeLists] Make kudu test main static

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21282 )

Change subject: [CMakeLists] Make kudu_test_main static
..

[CMakeLists] Make kudu_test_main static

This change makes sure that kudu_util.so is loaded at the beginning of
the library search order. This is needed because in unwind_safeness.cc
there is a dlsym execution in the constructor function, that creates a
wrapper for dlopen and dlclose to prevent a potential deadlock during
unwind stack resolve. It looks for the next declaration of the functions
called "dlopen" and "dlclose" in shared object files. If kudu_util is
loaded too late, then it won't find these functions and throws an error.
This happens in ubuntu 22.04 test runs. To solve this issue, the
kudu_test_main was changed to a static library and kudu_util was moved
to the front of the library list.

This is a best effort fix, and it should only have impact on test
execution.

Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea
Reviewed-on: http://gerrit.cloudera.org:8080/21282
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
---
M src/kudu/util/CMakeLists.txt
1 file changed, 2 insertions(+), 2 deletions(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea
Gerrit-Change-Number: 21282
Gerrit-PatchSet: 8
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 


[kudu-CR] [CMakeLists] Make kudu test main static

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21282 )

Change subject: [CMakeLists] Make kudu_test_main static
..


Patch Set 7: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea
Gerrit-Change-Number: 21282
Gerrit-PatchSet: 7
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Tue, 07 May 2024 07:28:09 +
Gerrit-HasComments: No


[kudu-CR] [CMakeLists] Make kudu test main static

2024-05-07 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21282 )

Change subject: [CMakeLists] Make kudu_test_main static
..


Patch Set 7:

Alright, if this helps in your environment, probably there is some value in 
this changelist.

I couldn't see this happening in my Ubuntu 22.04 environments, though.  
Probably, there is something quite specific in your case and/or configuration.


--
To view, visit http://gerrit.cloudera.org:8080/21282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0dfeb1fa04ed91e95fd1f8d789f020dd44289fea
Gerrit-Change-Number: 21282
Gerrit-PatchSet: 7
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Reviewer: Ádám Bakai 
Gerrit-Comment-Date: Tue, 07 May 2024 07:28:08 +
Gerrit-HasComments: No


[kudu-CR](gh-pages) [blog] Fix typo in blogpost

2024-05-06 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21378 )

Change subject: [blog] Fix typo in blogpost
..


Patch Set 4: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec743b5151ac9c139d3b3513f99d0369cfe0fc93
Gerrit-Change-Number: 21378
Gerrit-PatchSet: 4
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Comment-Date: Mon, 06 May 2024 22:13:29 +
Gerrit-HasComments: No


[kudu-CR] [security-flags-itest] Fix missing command line flags

2024-05-06 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21399 )

Change subject: [security-flags-itest] Fix missing command line flags
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21399/1/src/kudu/integration-tests/security-flags-itest.cc
File src/kudu/integration-tests/security-flags-itest.cc:

http://gerrit.cloudera.org:8080/#/c/21399/1/src/kudu/integration-tests/security-flags-itest.cc@54
PS1, Line 54: ASSERT_EQ(FLAGS_rpc_authentication, "required");
nit for {ASSERT,EXPECT}_EQ: the expected value comes first -- that way it's 
easier to comprehend the error message upon triggering an assertion



--
To view, visit http://gerrit.cloudera.org:8080/21399
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec751e8761562612d97b886740c9b20cd134a0bc
Gerrit-Change-Number: 21399
Gerrit-PatchSet: 1
Gerrit-Owner: Ádám Bakai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Comment-Date: Mon, 06 May 2024 16:10:49 +
Gerrit-HasComments: Yes


[kudu-CR] [ARM] Concurrent binary tree memory barriers fixed.

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21127 )

Change subject: [ARM] Concurrent binary tree memory barriers fixed.
..


Patch Set 10:

(1 comment)

> (1 comment)
 >
 > I realized that, the only proper way to add a 2-way load barrier is
 > with
 > std::atomic_thread_fence(std::memory_order_acquire), which is more
 > strict, than
 > the 1-way std::atomic::load(std::memory_order_acquire);
 >
 > I made some performance test. I used TestScanPerformance from
 > cbtree-test and
 > TestMemRowSetUpdatePerformance from memrowset-test. I also added 2
 > more
 > performance tests, one using basic arena, one using
 > ThreadSafeMemoryTrackingArena (we only use the 2nd in production).
 > I will add g++ tests too (as soon as it compiles).
 > See the branches sheet for name resolution.

For some reason, I assumed you were going to add those tests into this patch or 
as a separate patch, but it seems by 'adding

 > https://docs.google.com/spreadsheets/d/1whl8h7S0DcjVYp0jksR87P9v4U8TQH9nS1C9N0D-lUo/edit?usp=sharing

 > (1 comment)
 >
 > I realized that, the only proper way to add a 2-way load barrier is
 > with
 > std::atomic_thread_fence(std::memory_order_acquire), which is more
 > strict, than
 > the 1-way std::atomic::load(std::memory_order_acquire);
 >
 > I made some performance test. I used TestScanPerformance from
 > cbtree-test and
 > TestMemRowSetUpdatePerformance from memrowset-test. I also added 2
 > more
 > performance tests, one using basic arena, one using
 > ThreadSafeMemoryTrackingArena (we only use the 2nd in production).
 > I will add g++ tests too (as soon as it compiles).
 > See the branches sheet for name resolution.

For some reason, I assumed you were going to add your new tests as a part of 
this changelist as well, but it seems you just meant adding results from 
running those tests into the spreadsheet :)

OK, maybe it makes sense to add your new tests into Kudu as well?  Feel free to 
post that as a separate changelist.

Also, it would be great if you could address other feedback items (nits, 
questions of other reviewers, etc.) and post a follow-up revision, if necessary.

Thanks!

 > https://docs.google.com/spreadsheets/d/1whl8h7S0DcjVYp0jksR87P9v4U8TQH9nS1C9N0D-lUo/edit?usp=sharing

http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21127/6//COMMIT_MSG@33
PS6, Line 33: The following 6 barriers need to be more strict:
> See the google sheet.
Thanks for posting the numbers.  A few questions:
  * What were the machines that x86_64 and ARM tests were run at?  How close 
they are in terms of processing power (frequency, number of CPU cores)?
  * Do I understand correctly that both the x86_64 and ARM machines were the 
same for all the tests, just different binaries were run?
  * I'm not sure I understand what 'x86_master_clang_2' at the Summary tab is 
for?  How is it different from 'x86_master_clang'?
  * The numbers posted at the Summary tab: are those seconds of test runtime or 
that's some completely different (e.g., number scanned/inserted keys), or for 
different tests they are of different units?
  * The numbers posted at the Summary tab for the 
'TestMemRowSetUpdatePerformance' test: what are those?  Are those just total 
runtime or some particular timing report part (inserting, counting, updating) 
of the test?

Also, could you run existing corresponding Kudu tests (you can run you new 
tests as well, of course) prior and post your gerrit patch under the 'perf' 
utility (say, 10 runs) and report the results.  No need to put them into a 
spreadsheet, just present the output from the 'perf stat' utility.  You can 
browse 'git log' and search for the examples how running those tests is usually 
done.  You might also be interested in perf-mem and may be in perf-lock, but 
reports just from generic 'perf' would be great already.

Thanks a lot!

https://man7.org/linux/man-pages/man1/perf.1.html



--
To view, visit http://gerrit.cloudera.org:8080/21127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382
Gerrit-Change-Number: 21127
Gerrit-PatchSet: 10
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Anonymous Coward (763)
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Comment-Date: Sat, 04 May 2024 03:06:59 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) [blog] Fix typo in blogpost

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21378 )

Change subject: [blog] Fix typo in blogpost
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21378
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: gh-pages
Gerrit-MessageType: comment
Gerrit-Change-Id: Iec743b5151ac9c139d3b3513f99d0369cfe0fc93
Gerrit-Change-Number: 21378
Gerrit-PatchSet: 3
Gerrit-Owner: Abhishek Chennaka 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy 
Gerrit-Comment-Date: Sat, 04 May 2024 01:43:10 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3216 fix flakiness in LeadershipChangeOnTskGeneration

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21397


Change subject: KUDU-3216 fix flakiness in LeadershipChangeOnTskGeneration
..

KUDU-3216 fix flakiness in LeadershipChangeOnTskGeneration

Before this patch, the LeadershipChangeOnTskGeneration scenario from
CatalogManagerTskITest would fail once in about 50 runs, DEBUG build.

With this patch, the test scenario hasn't had a single failure in
about 300 runs, DEBUG build.

Change-Id: I139e46627206fc13490ca405bb62dc29934dc4be
---
M src/kudu/integration-tests/catalog_manager_tsk-itest.cc
1 file changed, 22 insertions(+), 5 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/97/21397/1
--
To view, visit http://gerrit.cloudera.org:8080/21397
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I139e46627206fc13490ca405bb62dc29934dc4be
Gerrit-Change-Number: 21397
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR] [g++11] Fix DecrementIntCell for g++10 and g++11

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21396 )

Change subject: [g++11] Fix DecrementIntCell for g++10 and g++11
..

[g++11] Fix DecrementIntCell for g++10 and g++11

There seems to be a compiler bug, that optimizes out the safety check
for INT_MIN in the DecrementIntCell function. It appears on RHEL 9.2
with g++ 11.4.1. Only in Release build. For more infoi, see:

https://stackoverflow.com/questions/78424303/g-optimizes-away-check-for-int-min-in-release-build

The issue seems to be fixed in g++12 and not yet present in g++9.

Solution:
Slightly change the function to ensure it is compiled correctly.
This modification should not alter the correct optimized code.

Basically, any change where the compiler cannot perform the two
optimization steps (in this order) should address the issue:

+ if (x == INT_MIN) x = INT_MAX; else x -= 1; > x -= 1
(this is equivalent on the x86 platform).
+ if (x - 1 < x) > if (true)
(this equivalence holds only at the mathematical level).

Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5
Reviewed-on: http://gerrit.cloudera.org:8080/21396
Tested-by: Alexey Serbin 
Reviewed-by: Alexey Serbin 
---
M src/kudu/common/key_util.cc
1 file changed, 1 insertion(+), 1 deletion(-)

Approvals:
  Alexey Serbin: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5
Gerrit-Change-Number: 21396
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [g++11] Fix DecrementIntCell for g++10 and g++11

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21396 )

Change subject: [g++11] Fix DecrementIntCell for g++10 and g++11
..


Patch Set 3: Code-Review+2


--
To view, visit http://gerrit.cloudera.org:8080/21396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5
Gerrit-Change-Number: 21396
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 03 May 2024 18:02:57 +
Gerrit-HasComments: No


[kudu-CR] [g++11] Fix DecrementIntCell for g++10 and g++11

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: [g++11] Fix DecrementIntCell for g++10 and g++11
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5
Gerrit-Change-Number: 21396
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)


[kudu-CR] [g++11] Fix DecrementIntCell for g++10 and g++11

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21396 )

Change subject: [g++11] Fix DecrementIntCell for g++10 and g++11
..


Patch Set 3: Verified+1

unrelated test failures:
  * KUDU-3216


--
To view, visit http://gerrit.cloudera.org:8080/21396
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia3cea2849a88c4d7e2587ceb805cd3258652e3c5
Gerrit-Change-Number: 21396
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 03 May 2024 18:01:27 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3568 Fix budgeting constraint test by enabling preset factor

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21360 )

Change subject: KUDU-3568 Fix budgeting constraint test by enabling preset 
factor
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21360/2//COMMIT_MSG
Commit Message:

PS2:
Thank you for the write up -- I must admit I haven't read it yet to digest 
everything, but I'm planning to do so when I can allocate more time for this.

The only feedback at this point from my side is that the test is still failing 
even with this fix:

src/kudu/tablet/compaction-test.cc:910: Failure
Value of: JoinStrings(sink.logged_msgs(), "\n")  
Expected: has substring "removed from compaction input due to memory 
constraints" 
  Actual: "I20240503 09:52:40.236797 953582 compaction-test.cc:904] 
CompactRowSetsOp complete. Timing: real 0.872s\tuser 0.844s\tsys 0.028s 
Metrics: 
{\"bytes_written\":4817,\"cfile_cache_hit\":90,\"cfile_cache_hit_bytes\":4310,\"cfile_cache_miss\":330,\"cfile_cache_miss_bytes\":3794180,\"cfile_init\":41,\"delta_iterators_relevant\":40,\"dirs.queue_time_us\":1303,\"dirs.run_cpu_time_us\":824,\"dirs.run_wall_time_us\":844,\"drs_written\":1,\"lbm_read_time_us\":4281,\"lbm_reads_lt_1ms\":494,\"lbm_write_time_us\":4794,\"lbm_writes_lt_1ms\":132,\"mutex_wait_us\":223,\"num_input_rowsets\":10,\"peak_mem_usage\":2147727,\"rows_written\":20,\"spinlock_wait_cycles\":83584,\"thread_start_us\":418,\"threads_started\":7}"
I20240503 09:52:40.240758 953582 test_util.cc:175] 
--- 
  
I20240503 09:52:40.240803 953582 test_util.cc:176] Had fatal failures, leaving 
test files at 
/tmp/kudutest-0/compaction-test.TestCompaction.TestRowSetCompactionSkipWithBudgetingConstraints.1714755106413846-953582-0



--
To view, visit http://gerrit.cloudera.org:8080/21360
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9df218cd2d8ef3709793db267d5a0d651421dbb6
Gerrit-Change-Number: 21360
Gerrit-PatchSet: 2
Gerrit-Owner: Ashwani Raina 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Comment-Date: Fri, 03 May 2024 16:57:19 +
Gerrit-HasComments: Yes


[kudu-CR] [tserver] limit number of trace metrics for TabletCopyClient

2024-05-03 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21393 )

Change subject: [tserver] limit number of trace metrics for TabletCopyClient
..

[tserver] limit number of trace metrics for TabletCopyClient

The trace metrics registry assumes that the number of entries is quite
small, with the current threshold of 100.  In its turn, a thread pool
unconditionally registers and updates its trace metrics while executing
submitted tasks.  Since a tablet server might host thousands of tablet
replicas, it's not a good idea to include the UUID of the tablet into
the name of the download thread pool spawned by every TabletCopyClient
instance.

This is a follow-up to 0d95304fa46ee5d96bcaa934c7660369f2860e06.

Change-Id: I334aa81aaed2378e7cae558bd8bb9e0f0c970fec
Reviewed-on: http://gerrit.cloudera.org:8080/21393
Tested-by: Marton Greber 
Reviewed-by: Marton Greber 
---
M src/kudu/tserver/tablet_copy_client.cc
1 file changed, 10 insertions(+), 2 deletions(-)

Approvals:
  Marton Greber: Looks good to me, approved; Verified

--
To view, visit http://gerrit.cloudera.org:8080/21393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I334aa81aaed2378e7cae558bd8bb9e0f0c970fec
Gerrit-Change-Number: 21393
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 


[kudu-CR] [tserver] limit number of trace metrics for TabletCopyClient

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/21393


Change subject: [tserver] limit number of trace metrics for TabletCopyClient
..

[tserver] limit number of trace metrics for TabletCopyClient

The trace metrics registry assumes that the number of entries is quite
small, with the current threshold of 100.  In its turn, a thread pool
unconditionally registers and updates its trace metrics while executing
submitted tasks.  Since a tablet server might host thousands of tablet
replicas, it's not a good idea to include the UUID of the tablet into
the name of the download thread pool spawned by every TabletCopyClient
instance.

This is a follow-up to 0d95304fa46ee5d96bcaa934c7660369f2860e06.

Change-Id: I334aa81aaed2378e7cae558bd8bb9e0f0c970fec
---
M src/kudu/tserver/tablet_copy_client.cc
1 file changed, 10 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/93/21393/1
--
To view, visit http://gerrit.cloudera.org:8080/21393
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I334aa81aaed2378e7cae558bd8bb9e0f0c970fec
Gerrit-Change-Number: 21393
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 


[kudu-CR](branch-1.17.x) KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21366 )

Change subject: KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()
..

KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()

This patch addresses data race reported in KUDU-3569, using
the tablet schema defined by the iterator's projection instead of
the schema stored in the tablet metadata file.  The latter might
be swapped by concurrently running AlterTable, which is the root
cause of the data race.

This is a follow-up to 936d7edc4e4b69d2e1f1dffc96760cb3fd57a934.

Change-Id: I92daa74cb86a77a4350f42db9ca5dec3a0d4ff75
Reviewed-on: http://gerrit.cloudera.org:8080/21359
Tested-by: Alexey Serbin 
Reviewed-by: Abhishek Chennaka 
(cherry picked from commit 977f1911fbcc4d5c323d6ae7ce7c1ab100ed11ea)
  Conflicts:
src/kudu/tablet/cfile_set.h
Reviewed-on: http://gerrit.cloudera.org:8080/21366
---
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
2 files changed, 16 insertions(+), 6 deletions(-)

Approvals:
  Abhishek Chennaka: Looks good to me, approved
  Alexey Serbin: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I92daa74cb86a77a4350f42db9ca5dec3a0d4ff75
Gerrit-Change-Number: 21366
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>


[kudu-CR](branch-1.17.x) KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I92daa74cb86a77a4350f42db9ca5dec3a0d4ff75
Gerrit-Change-Number: 21366
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>


[kudu-CR](branch-1.17.x) KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21366 )

Change subject: KUDU-3569 fix race in CFileSet::Iterator::OptimizePKPredicates()
..


Patch Set 1: Verified+1

unrelated test failures


--
To view, visit http://gerrit.cloudera.org:8080/21366
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I92daa74cb86a77a4350f42db9ca5dec3a0d4ff75
Gerrit-Change-Number: 21366
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Thu, 02 May 2024 20:59:33 +
Gerrit-HasComments: No


[kudu-CR](branch-1.17.x) KUDU-3570 fix use-after-free in MajorDeltaCompactionOp

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21365 )

Change subject: KUDU-3570 fix use-after-free in MajorDeltaCompactionOp
..


Patch Set 1: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/21365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: comment
Gerrit-Change-Id: I491c6d98bed8780bcfb62f152db471d7a260d305
Gerrit-Change-Number: 21365
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>
Gerrit-Comment-Date: Thu, 02 May 2024 20:59:01 +
Gerrit-HasComments: No


[kudu-CR](branch-1.17.x) KUDU-3570 fix use-after-free in MajorDeltaCompactionOp

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-3570 fix use-after-free in MajorDeltaCompactionOp
..


Removed Verified-1 by Kudu Jenkins (120)
--
To view, visit http://gerrit.cloudera.org:8080/21365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I491c6d98bed8780bcfb62f152db471d7a260d305
Gerrit-Change-Number: 21365
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>


[kudu-CR](branch-1.17.x) KUDU-3570 fix use-after-free in MajorDeltaCompactionOp

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/21365 )

Change subject: KUDU-3570 fix use-after-free in MajorDeltaCompactionOp
..

KUDU-3570 fix use-after-free in MajorDeltaCompactionOp

This patch addresses heap-use-after-free and data race issues reported
in KUDU-3570.  With this and one prior patch, neither TSAN nor ASAN
reports any warnings when running alter_table-randomized-test, at least
that's the stats collected from more than 100 iterations.

Change-Id: I491c6d98bed8780bcfb62f152db471d7a260d305
Reviewed-on: http://gerrit.cloudera.org:8080/21362
Tested-by: Alexey Serbin 
Reviewed-by: Abhishek Chennaka 
(cherry picked from commit 3912a97cd8998ef04c4e6f9c38bd365c582e8171)
Reviewed-on: http://gerrit.cloudera.org:8080/21365
Reviewed-by: Wang Xixu <1450306...@qq.com>
---
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
2 files changed, 15 insertions(+), 10 deletions(-)

Approvals:
  Wang Xixu: Looks good to me, but someone else must approve
  Abhishek Chennaka: Looks good to me, approved
  Alexey Serbin: Verified

--
To view, visit http://gerrit.cloudera.org:8080/21365
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: branch-1.17.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I491c6d98bed8780bcfb62f152db471d7a260d305
Gerrit-Change-Number: 21365
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin 
Gerrit-Reviewer: Abhishek Chennaka 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Attila Bukor 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Wang Xixu <1450306...@qq.com>


[kudu-CR] KUDU-3371 Fix the crash bug when run binaries on older CPU machines

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21287 )

Change subject: KUDU-3371 Fix the crash bug when run binaries on older CPU 
machines
..


Patch Set 1:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@7
PS1, Line 7: KUDU-3371
> Use a dedicated jira ID that addresses this specific crash. KUDU-3371 is a
+1


http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@11
PS1, Line 11: the
nit: drop


http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@12
PS1, Line 12: the
nit: a


http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@12
PS1, Line 12: (e.g. AVX512)
Are you sure that was the only reason behind those crashes?

As already pointed at other reviewers, it might help to summarize your findings 
in an upstream JIRA item.  Let me know if you want me to help with providing 
information on crashes I see in my environment.

Thank you!


http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@15
PS1, Line 15: WITH_SNAPPY
> nit: Should this be 'PORTABLE'?
+1


http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@15
PS1, Line 15: build
nit: building


http://gerrit.cloudera.org:8080/#/c/21287/1//COMMIT_MSG@15
PS1, Line 15: enable
nit: enables



--
To view, visit http://gerrit.cloudera.org:8080/21287
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id30ae995c41a592fccbdb822bc1f457c5e6878ac
Gerrit-Change-Number: 21287
Gerrit-PatchSet: 1
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: KeDeng 
Gerrit-Reviewer: Yifan Zhang 
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Thu, 02 May 2024 19:42:37 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3371 check for RocksDB dir presence upon opening FSManager

2024-05-02 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21295 )

Change subject: KUDU-3371 check for RocksDB dir presence upon opening FSManager
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21295/1/src/kudu/fs/data_dirs.cc
File src/kudu/fs/data_dirs.cc:

http://gerrit.cloudera.org:8080/#/c/21295/1/src/kudu/fs/data_dirs.cc@212
PS1, Line 212: // Initialize the directory only if it's healthy.
 : // Note: the unhealthy directories wi
> Because the rocksdb instance will be Open() in RdbDir::Prepare(), this proc
I'm not sure I understand: what exactly makes them not suitable to have in the 
constructor?  There isn't any error handling in this code except for logging a 
WARNING message.

Could you please clarify?



--
To view, visit http://gerrit.cloudera.org:8080/21295
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iab4bffc6b902ab96edf0ca6c44f51c8db2670d52
Gerrit-Change-Number: 21295
Gerrit-PatchSet: 3
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Thu, 02 May 2024 19:05:27 +
Gerrit-HasComments: Yes


  1   2   3   4   5   6   7   8   9   10   >