[ 
https://issues.apache.org/jira/browse/HDFS-17049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17732415#comment-17732415
 ] 

ASF GitHub Bot commented on HDFS-17049:
---------------------------------------

Hexiaoqiao commented on code in PR #5743:
URL: https://github.com/apache/hadoop/pull/5743#discussion_r1229152046


##########
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestSequentialBlockGroupId.java:
##########
@@ -172,6 +176,41 @@ public void testTriggerBlockGroupIdCollision() throws IOException {
     }
   }
 
+  /**
+   * Test that the values generated by blockGroup ID generator are unique,
+   * even if they are generated concurrently.
+   * @throws Exception
+   */
+  @Test
+  public void testBlockGroupIdThreadSafety() throws Exception {
+    List<List<Long>> blockIds = new ArrayList<>();
+    List<Thread> threads = new ArrayList<>();
+    for (int i = 0; i < 20; i++) {
+      blockIds.add(new ArrayList<>());
+      threads.add(new Thread(() -> {
+        for (int j = 0; j < 1000; j++) {
+          long next = blockGrpIdGenerator.nextValue();
+          blockIds.get(j).add(next);
+        }
+      }));
+    }
+    for (Thread t : threads) {
+      t.start();
+    }
+    for (Thread t : threads) {
+      t.join();
+    }
+    Set<Long> allBlockIds = new HashSet<>();

Review Comment:
   Would this be more readable?
   ```
       Set<Long> allBlockIds = new HashSet<>();
       for (List<Long> set : blockIds) {
         for (long id : set) {
           if (!allBlockIds.add(id)) {
             fail("Same block group id is generated!");
           }
         }
       }
   ```
   Is there a simpler way to do this check, for example by using a `HashMap` directly?
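
   One possible shape for a simpler check, as a rough sketch only: every worker thread adds its values to a single concurrent set and relies on the return value of `add`, so no per-thread lists or second pass are needed. The class `UniqueIdCheckSketch`, the method `generatesUniqueIds`, and the `AtomicLong` used in place of `blockGrpIdGenerator` are all hypothetical stand-ins, not code from the PR.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;
   import java.util.concurrent.atomic.AtomicBoolean;
   import java.util.concurrent.atomic.AtomicLong;
   import java.util.function.LongSupplier;

   // Hypothetical helper for illustration; not part of the Hadoop test.
   class UniqueIdCheckSketch {

     /** Returns true if no value from the generator was observed twice. */
     static boolean generatesUniqueIds(LongSupplier generator, int numThreads,
         int idsPerThread) throws InterruptedException {
       // Thread-safe set shared by all workers.
       Set<Long> seen = ConcurrentHashMap.newKeySet();
       AtomicBoolean duplicate = new AtomicBoolean(false);
       List<Thread> threads = new ArrayList<>();
       for (int i = 0; i < numThreads; i++) {
         threads.add(new Thread(() -> {
           for (int j = 0; j < idsPerThread; j++) {
             // add() returns false when the value is already present.
             if (!seen.add(generator.getAsLong())) {
               duplicate.set(true);
             }
           }
         }));
       }
       for (Thread t : threads) {
         t.start();
       }
       for (Thread t : threads) {
         t.join();
       }
       return !duplicate.get();
     }

     public static void main(String[] args) throws InterruptedException {
       // Assumption: an AtomicLong stands in for the real ID generator.
       AtomicLong counter = new AtomicLong();
       System.out.println(generatesUniqueIds(counter::incrementAndGet, 20, 1000));
     }
   }
   ```

   In the real test the supplier would presumably be `blockGrpIdGenerator::nextValue`, and the boolean result could be passed to `assertTrue`.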



##########
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/SequentialBlockGroupIdGenerator.java:
##########
@@ -51,7 +51,7 @@ public class SequentialBlockGroupIdGenerator extends SequentialNumber {
   }
 
   @Override // NumberGenerator
-  public long nextValue() {
+  synchronized public long nextValue() {

Review Comment:
   `synchronized public long nextValue()` -> `public synchronized long nextValue()`
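
   For context, a toy sketch of why the method needs `synchronized` at all, written with the customary modifier order (access modifier first). The class `SynchronizedGeneratorSketch`, its `STEP` constant, and the helper `countDistinct` are hypothetical and far simpler than `SequentialBlockGroupIdGenerator`; the sketch only assumes that the generator reads a shared value, computes the next one, and writes it back.

   ```java
   import java.util.ArrayList;
   import java.util.List;
   import java.util.Set;
   import java.util.concurrent.ConcurrentHashMap;

   // Toy stand-in for illustration; not the Hadoop class.
   class SynchronizedGeneratorSketch {
     private static final long STEP = 16; // hypothetical stride between IDs
     private long current = 0;

     // Read, compute, and write back happen under one lock. Without
     // synchronized, two threads could read the same `current` and
     // both return the same value.
     public synchronized long nextValue() {
       long next = current + STEP;
       current = next;
       return next;
     }

     /** Runs the generator from several threads and counts distinct values. */
     static int countDistinct(int numThreads, int idsPerThread)
         throws InterruptedException {
       SynchronizedGeneratorSketch gen = new SynchronizedGeneratorSketch();
       Set<Long> seen = ConcurrentHashMap.newKeySet();
       List<Thread> threads = new ArrayList<>();
       for (int i = 0; i < numThreads; i++) {
         threads.add(new Thread(() -> {
           for (int j = 0; j < idsPerThread; j++) {
             seen.add(gen.nextValue());
           }
         }));
       }
       for (Thread t : threads) {
         t.start();
       }
       for (Thread t : threads) {
         t.join();
       }
       return seen.size();
     }

     public static void main(String[] args) throws InterruptedException {
       // With synchronized, every call yields a distinct value.
       System.out.println(countDistinct(8, 1000) == 8 * 1000);
     }
   }
   ```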





> EC: Fix duplicate block group IDs generated by SequentialBlockGroupIdGenerator
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-17049
>                 URL: https://issues.apache.org/jira/browse/HDFS-17049
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Shuyan Zhang
>            Assignee: Shuyan Zhang
>            Priority: Major
>              Labels: pull-request-available
>
> When I used multiple clients to write EC files concurrently, I found that 
> NameNode generated the same block group ID for different files:
> ```
> 2023-06-13 20:09:59,514 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_-9223372036854697568_14389 for /ec-test/10/4068034329705654124
> 2023-06-13 20:09:59,514 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocate blk_-9223372036854697568_14390 for /ec-test/19/7042966144171770731
> ```
> After diving into `SequentialBlockGroupIdGenerator`, I found that the current 
> implementation of `nextValue` is not thread-safe.
> This problem must be fixed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
