[
https://issues.apache.org/jira/browse/HDFS-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732415#comment-17732415
]
ASF GitHub Bot commented on HDFS-17049:
---------------------------------------
Hexiaoqiao commented on code in PR #5743:
URL: https://github.com/apache/hadoop/pull/5743#discussion_r1229152046
##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockGroupId.java:
##########
@@ -172,6 +176,41 @@ public void testTriggerBlockGroupIdCollision() throws IOException {
}
}
+ /**
+ * Test that the values generated by blockGroup ID generator are unique,
+ * even if they are generated concurrently.
+ * @throws Exception
+ */
+ @Test
+ public void testBlockGroupIdThreadSafety() throws Exception {
+ List<List<Long>> blockIds = new ArrayList<>();
+ List<Thread> threads = new ArrayList<>();
+ for (int i = 0; i < 20; i++) {
+ blockIds.add(new ArrayList<>());
+ threads.add(new Thread(() -> {
+ for (int j = 0; j < 1000; j++) {
+ long next = blockGrpIdGenerator.nextValue();
+ blockIds.get(j).add(next);
+ }
+ }));
+ }
+ for (Thread t : threads) {
+ t.start();
+ }
+ for (Thread t : threads) {
+ t.join();
+ }
+ Set<Long> allBlockIds = new HashSet<>();
Review Comment:
Would the following be more readable here?
```
Set<Long> allBlockIds = new HashSet<>();
for (List<Long> set : blockIds) {
  for (long id : set) {
    if (!allBlockIds.add(id)) {
      fail("Same block group id is generated!");
    }
  }
}
```
Is there a simpler way to do this check, such as using a `HashMap` directly?
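
One possible shape for such a simpler check, as a rough sketch only: every thread records into a single thread-safe set, and the test compares the set size with the number of calls. `UniqueIdCheck` and `countDistinctIds` are hypothetical names, not part of the PR, and the `AtomicLong` in `main` is only a stand-in for `blockGrpIdGenerator.nextValue()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

// Hypothetical helper for illustration; not part of the Hadoop code base.
class UniqueIdCheck {

  /**
   * Calls the generator from several threads and returns the number of
   * distinct values observed across all of them.
   */
  static int countDistinctIds(LongSupplier generator, int numThreads,
      int idsPerThread) {
    // Thread-safe set, so all threads can record into one collection.
    Set<Long> seen = ConcurrentHashMap.newKeySet();
    List<Thread> threads = new ArrayList<>();
    for (int i = 0; i < numThreads; i++) {
      threads.add(new Thread(() -> {
        for (int j = 0; j < idsPerThread; j++) {
          seen.add(generator.getAsLong());
        }
      }));
    }
    for (Thread t : threads) {
      t.start();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return seen.size();
  }

  public static void main(String[] args) {
    // Stand-in generator (assumption); a real test would pass
    // blockGrpIdGenerator::nextValue instead.
    AtomicLong counter = new AtomicLong();
    int distinct = countDistinctIds(counter::incrementAndGet, 20, 1000);
    System.out.println(
        distinct == 20 * 1000 ? "all unique" : "duplicates found");
  }
}
```

In a JUnit test the final comparison would presumably become an `assertEquals` on the expected count, which also removes the need for the per-thread lists.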
##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockGroupIdGenerator.java:
##########
@@ -51,7 +51,7 @@ public class SequentialBlockGroupIdGenerator extends SequentialNumber {
}
@Override // NumberGenerator
- public long nextValue() {
+ synchronized public long nextValue() {
Review Comment:
`synchronized public long nextValue()` -> `public synchronized long nextValue()`
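
For illustration, a minimal sketch of a sequential generator whose `nextValue()` uses the conventional modifier order (access modifier first, then `synchronized`, as the Java Language Specification recommends). `SimpleSequentialIdGenerator` and `runConcurrently` are hypothetical names, and the class does not reproduce the ID-skipping logic of `SequentialBlockGroupIdGenerator`; it only shows how `synchronized` keeps the read-modify-write step atomic.

```java
// Hypothetical stand-in for a sequential ID generator; not the Hadoop class.
class SimpleSequentialIdGenerator {
  private long current;

  SimpleSequentialIdGenerator(long initialValue) {
    this.current = initialValue;
  }

  // Conventional modifier order: "public synchronized", not "synchronized public".
  public synchronized long nextValue() {
    // The increment and the read happen under the instance lock, so two
    // threads cannot observe the same value.
    current = current + 1;
    return current;
  }

  public synchronized long getCurrentValue() {
    return current;
  }

  /** Calls nextValue() from several threads and returns the final value. */
  static long runConcurrently(SimpleSequentialIdGenerator gen,
      int numThreads, int callsPerThread) {
    Thread[] threads = new Thread[numThreads];
    for (int i = 0; i < numThreads; i++) {
      threads[i] = new Thread(() -> {
        for (int j = 0; j < callsPerThread; j++) {
          gen.nextValue();
        }
      });
      threads[i].start();
    }
    for (Thread t : threads) {
      try {
        t.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException(e);
      }
    }
    return gen.getCurrentValue();
  }

  public static void main(String[] args) {
    SimpleSequentialIdGenerator gen = new SimpleSequentialIdGenerator(0);
    // With synchronized in place, no increment is lost.
    System.out.println(runConcurrently(gen, 20, 1000));
  }
}
```

Both orderings compile to the same thing; the change is purely stylistic.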
> EC: Fix duplicate block group IDs generated by SequentialBlockGroupIdGenerator
> ------------------------------------------------------------------------------
>
> Key: HDFS-17049
> URL: https://issues.apache.org/jira/browse/HDFS-17049
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Shuyan Zhang
> Assignee: Shuyan Zhang
> Priority: Major
> Labels: pull-request-available
>
> When I used multiple clients to write EC files concurrently, I found that
> NameNode generated the same block group ID for different files:
> ```
> 2023-06-13 20:09:59,514 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> allocate blk_-9223372036854697568_14389 for /ec-test/10/4068034329705654124
> 2023-06-13 20:09:59,514 INFO org.apache.hadoop.hdfs.StateChange: BLOCK*
> allocate blk_-9223372036854697568_14390 for /ec-test/19/7042966144171770731
> ```
> After diving into `SequentialBlockGroupIdGenerator`, I found that the current
> implementation of `nextValue` is not thread-safe.
> This problem must be fixed.