[kudu-CR] KUDU-3371 [fs] Use RocksDB to store LBM metadata (2nd try)

2024-03-19 Thread Yingchun Lai (Code Review)
Yingchun Lai has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21075 )

Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata (2nd try)
..


Patch Set 6:

> Patch Set 6:
>
> Just a clarifying question, so the only thing changed compared to the 1st 
> revision of this patch is that lz4 is switched out for snappy? -> only 
> changes in thirdparty/build-definitions.sh ?
>
> Thanks for debugging the issue!

You can review the changes by https://gerrit.cloudera.org/c/21075/1..6, the 
changed files are:
Commit message
cmake_modules/FindSnappy.cmake
src/kudu/fs/CMakeLists.txt
src/kudu/server/CMakeLists.txt
src/kudu/util/CMakeLists.txt
thirdparty/build-definitions.sh


--
To view, visit http://gerrit.cloudera.org:8080/21075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23b7d2a16802af01a382a1d74cd9869baf364688
Gerrit-Change-Number: 21075
Gerrit-PatchSet: 6
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Wed, 20 Mar 2024 02:50:00 +
Gerrit-HasComments: No


[kudu-CR] KUDU-3371 [fs] Use RocksDB to store LBM metadata (2nd try)

2024-03-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21075 )

Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata (2nd try)
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21075/6/thirdparty/build-definitions.sh
File thirdparty/build-definitions.sh:

http://gerrit.cloudera.org:8080/#/c/21075/6/thirdparty/build-definitions.sh@1213
PS6, Line 1213: DWITH_LZ4=OFF
Does this mean we don't need LZ4 support for our usage pattern of RocksDB, or 
it's just some issue to build rocksdb to pickup and use both LZ4 and snappy 
libraries from Kudu's 3rd party?



--
To view, visit http://gerrit.cloudera.org:8080/21075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23b7d2a16802af01a382a1d74cd9869baf364688
Gerrit-Change-Number: 21075
Gerrit-PatchSet: 6
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Wed, 20 Mar 2024 01:28:05 +
Gerrit-HasComments: Yes


[kudu-CR] [wip][build] KUDU-3551 Upgrade gradle to 7.6.4

2024-03-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21030 )

Change subject: [wip][build] KUDU-3551 Upgrade gradle to 7.6.4
..


Patch Set 5:

(5 comments)

Just a few quite random comments.  I haven't taken a proper look at this patch 
yet.

http://gerrit.cloudera.org:8080/#/c/21030/5//COMMIT_MSG
Commit Message:

PS5:
It would be great to include the references to the Gradle documentation that 
you used to find information on deprecated configurations, mappings between the 
old/deprecated ones and the new ones, etc.


http://gerrit.cloudera.org:8080/#/c/21030/5/java/build.gradle
File java/build.gradle:

http://gerrit.cloudera.org:8080/#/c/21030/5/java/build.gradle@a61
PS5, Line 61:
Shouldn't this become 'implementation' as per docs at 
https://docs.gradle.org/current/userguide/upgrading_version_6.html ?

Or the original designation of this to be a 'compile' configuration was not 
correct?


http://gerrit.cloudera.org:8080/#/c/21030/5/java/kudu-subprocess/build.gradle
File java/kudu-subprocess/build.gradle:

http://gerrit.cloudera.org:8080/#/c/21030/5/java/kudu-subprocess/build.gradle@64
PS5, Line 64: implementation libs.yetusAnnotations
Could this be 'compileOnly' as well?


http://gerrit.cloudera.org:8080/#/c/21030/5/java/kudu-test-utils/build.gradle
File java/kudu-test-utils/build.gradle:

http://gerrit.cloudera.org:8080/#/c/21030/5/java/kudu-test-utils/build.gradle@24
PS5, Line 24:
style nit: this isn't consistent with the rest of the code in this file -- 
there isn't a space between a parenthesis elsewhere, even in the newly added 
shadowJar block at lines 60-67


http://gerrit.cloudera.org:8080/#/c/21030/5/java/kudu-test-utils/build.gradle@54
PS5, Line 54:   implementation libs.async
What is this for and why it appeared only while upgrading to Gradle7?  Would be 
great to add a comment.



--
To view, visit http://gerrit.cloudera.org:8080/21030
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I915dab011aba633d55a79c72ea6f459d703d7f47
Gerrit-Change-Number: 21030
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Chovan 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 20 Mar 2024 01:10:58 +
Gerrit-HasComments: Yes


[kudu-CR](gh-pages) [blog] blogpost about auto-incrementing column in Kudu

2024-03-19 Thread Alexey Serbin (Code Review)
Alexey Serbin has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21119 )

Change subject: [blog] blogpost about auto-incrementing column in Kudu
..


Patch Set 2:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md
File _posts/2024-03-07-introducing-auto-incrementing-column.md:

PS2:
Please remove trailing spaces in every line of this file.


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@20
PS2, Line 20: declared non-unique
partially defined


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@21
PS2, Line 21: This is
The idea behind the auto-incrementing column is to have a monotonically 
increasing counter.


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@21
PS2, Line 21: The
: system populates the counter upon every INSERT operation
It would be great to mention this is done at the server side, not at the client 
side ("system" sounds a bit vague, at least to me).


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@22
PS2, Line 22: partition local
partition-local


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@29
PS2, Line 29: to the next highest available counter value
Do you mind deciphering 'next highest available'?  What is the availability 
criteria in this context?


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@35
PS2, Line 35: auto_incrementing_counter
This is a bit confusing to me: is that "auto_incrementing_id" or 
"auto_incrementing_counter"?  Or it's named differently in various places?


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@46
PS2, Line 46: If the column is really needed
If the auto-incrementing column's data is needed, ...


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@79
PS2, Line 79: primary key
partial primary key ?


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@87
PS2, Line 87: for column id
nit: in the 'id' column


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@104
PS2, Line 104: fetch
to fetch


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@106
PS2, Line 106: SELECT *, auto_incrementing_id
I was under impression that Impala would introduce a special session-level 
config option to do so, no?


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@107
PS2, Line 107: SELECT *, auto_incrementing_id
Just to make sure it understand this correctly: will the queries below work as 
expected as well

SELECT auto_incrementing_id FROM demo_table;

SELECT auto_incrementing_id, * FROM demo_table;


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@120
PS2, Line 120: Update and Delete a row
nit: Updating and deleting rows


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@125
PS2, Line 125: default> DELETE FROM demo_table where id=2;
 : Query: DELETE FROM demo_table where id=2;
 : Modified 1 row(s), 0 row error(s) in 1.40s
Do you mind updating this example (or adding an extra one with UPDATE or DELETE 
statement) using id = 1 predicate to see a report on updating multiple rows at 
once?


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@150
PS2, Line 150: Unlike in Impala scanning the table all the table data including 
the auto incrementing column is
Unlike in Impala, scanning ...


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@150
PS2, Line 150: the table all the table data including the auto incrementing 
column is
 : returned when the entire table is scanned and there is no need 
to explicitly request the
 : auto-incrementing column.
Consider rewriting this part to be more readable.


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@155
PS2, Line 155: the auto_incrementing column value
a value for the auto-incrementing column  ?


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@158
PS2, Line 158:  Examples
Break this into two separate sentences.


http://gerrit.cloudera.org:8080/#/c/21119/2/_posts/2024-03-07-introducing-auto-incrementing-column.md@166
PS2, Line 166: existing
Why adding 'existing'?  Is this 

[kudu-CR] WIP [ARM] Concurrent binary tree memory fences fixed.

2024-03-19 Thread Zoltan Martonka (Code Review)
Hello Zoltan Chovan, Ashwani Raina, Kudu Jenkins, Anonymous Coward (763),

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/21127

to look at the new patch set (#4).

Change subject: WIP [ARM] Concurrent binary tree memory fences fixed.
..

WIP [ARM] Concurrent binary tree memory fences fixed.

TestCBTree.TestRacyConcurrentInsert sometimes fails on ARM.
The concurrent inserts produce a tree with some incorrectly ordered nodes. Apart
from incorrect order, there are no other errors in the tree. All inserted
elements are present, but CBTree::GetCopy cannot find them due to the incorrect
ordering.

This is not a unit test error, but a real error in the CBTree implementation.
Modifying the test to only do the inserts first, and only start checking when
the tree is finalized does not prevent the bug. So calling only the tree->Insert
function from multiple threads is sufficient to reproduce the issue (only on
ARM).

WIP cause: This change fixes the problem, and can be the final solution.
However, I would also consider changing the version field to std::atomic and
keep using the same barriers as now (not the default memory_order_seq_cst).
Either way, we need to agree on the proposed changes of the memory barriers
then I can also make a version with using std::atomic.

Root cause:

Some memory order restrictions are not strict enough. It does not cause a
problem on x86 (There were no real changes for 11 years), but it causes a
problems on ARM. x86 guarantees a globally consistent order for stores (TSO).
ARM, in contrast, allows stores to different memory locations to be observed
differently from the program order.
More info:
https://www.sciencedirect.com/science/article/pii/S1383762124000390

The following 6 barriers need to be more strict:

1. When we set the Splitting / Inserting flag on a node, then it is not enough
  to flush all previous changes. It is also very important not to reorder any
  write before it. So instead of Release_Store (which is practically equivalent
  to std::memory_order_release), we need a 2 way barrier.

2. When we search in SeekToLeaf/GetCopy/ContainsKey/TraverseToLeaf, and we
  check if the version is unchanged, then a load_acquire (equivalent to
  std::memory_order_acquire) is not sufficient. We have to also make sure no
  read is reordered in the other direction.

Currently adding only the 2 barriers under (1.) is sufficient to stabilize the
test. However the constraints imposed by load_acquire under (2.) are also
insufficient in theory.

Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382
---
M src/kudu/tablet/concurrent_btree.h
1 file changed, 9 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/27/21127/4
--
To view, visit http://gerrit.cloudera.org:8080/21127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382
Gerrit-Change-Number: 21127
Gerrit-PatchSet: 4
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Anonymous Coward (763)
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Reviewer: Zoltan Martonka 


[kudu-CR] WIP [ARM] Concurrent binary tree memory fences fixed.

2024-03-19 Thread Anonymous Coward (Code Review)
Anonymous Coward (763) has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21127 )

Change subject: WIP [ARM] Concurrent binary tree memory fences fixed.
..


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21127/3//COMMIT_MSG@43
PS3, Line 43: the fences
> base::subtle::Release_Store(v, *v | BTREE_INSERTING_MASK);
Sorry I wrote it wrong. base::subtle::Release_Store(v, *v | 
BTREE_INSERTING_MASK); is equivalent to:
std::atomic_thread_fence(std::memory_order_release)
Lazy_Store(v, *v | BTREE_INSERTING_MASK);



--
To view, visit http://gerrit.cloudera.org:8080/21127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie7c02dc3b9444599ef680fa9b29ae7a4d39dd382
Gerrit-Change-Number: 21127
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Martonka 
Gerrit-Reviewer: Anonymous Coward (763)
Gerrit-Reviewer: Ashwani Raina 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Chovan 
Gerrit-Reviewer: Zoltan Martonka 
Gerrit-Comment-Date: Tue, 19 Mar 2024 16:39:10 +
Gerrit-HasComments: Yes


[kudu-CR] KUDU-3371 [fs] Use RocksDB to store LBM metadata (2nd try)

2024-03-19 Thread Marton Greber (Code Review)
Marton Greber has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21075 )

Change subject: KUDU-3371 [fs] Use RocksDB to store LBM metadata (2nd try)
..


Patch Set 6:

Just a clarifying question, so the only thing changed compared to the 1st 
revision of this patch is that lz4 is switched out for snappy? -> only changes 
in thirdparty/build-definitions.sh ?

Thanks for debugging the issue!


--
To view, visit http://gerrit.cloudera.org:8080/21075
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I23b7d2a16802af01a382a1d74cd9869baf364688
Gerrit-Change-Number: 21075
Gerrit-PatchSet: 6
Gerrit-Owner: Yingchun Lai 
Gerrit-Reviewer: Alexey Serbin 
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Marton Greber 
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Reviewer: Yingchun Lai 
Gerrit-Comment-Date: Tue, 19 Mar 2024 13:06:30 +
Gerrit-HasComments: No