This is an automated email from the ASF dual-hosted git repository.

ethanfeng pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/celeborn.git


The following commit(s) were added to refs/heads/main by this push:
     new 4b150be0c [CELEBORN-1669] Fix NullPointerException for PartitionFilesSorter#updateSortedShuffleFiles after cleaning up expired shuffle key
4b150be0c is described below

commit 4b150be0c89bd47ac2c79b65638142999a5cd949
Author: SteNicholas <[email protected]>
AuthorDate: Thu Oct 24 17:47:25 2024 +0800

    [CELEBORN-1669] Fix NullPointerException for PartitionFilesSorter#updateSortedShuffleFiles after cleaning up expired shuffle key
    
    ### What changes were proposed in this pull request?
    
    Fix the `NullPointerException` thrown by `PartitionFilesSorter#updateSortedShuffleFiles` when it runs after the expired shuffle key has been cleaned up.
    
    ### Why are the changes needed?
    
    `PartitionFilesSorter` sorts shuffle files in the `worker-file-sorter-executor` thread and cleans up expired shuffle keys in the `worker-expired-shuffle-cleaner` thread. If `worker-expired-shuffle-cleaner` cleans up an expired shuffle key before `worker-file-sorter-executor` updates the sorted shuffle files for that key, the update currently throws a `NullPointerException`:
    
    ```
    2024-10-23 17:26:17,162 [INFO] [worker-expired-shuffle-cleaner] - org.apache.celeborn.service.deploy.worker.Worker -Logging.scala(51) -Cleaned up expired shuffle application_1724141892576_3843182_1-0
    2024-10-23 17:26:17,392 [ERROR] [worker-file-sorter-executor-237572] - org.apache.celeborn.service.deploy.worker.storage.PartitionFilesSorter -PartitionFilesSorter.java(752) -Sorting shuffle file for application_1724141892576_3843182_1-0-1875-0-0 /mnt/storage02/celeborn-worker/shuffle_data/application_1724141892576_3843182_1/0/1875-0-0 failed, detail:
    java.lang.NullPointerException: null
        at org.apache.celeborn.service.deploy.worker.storage.PartitionFilesSorter.updateSortedShuffleFiles(PartitionFilesSorter.java:455) ~[celeborn-worker_2.12-0.5.0-SNAPSHOT.jar:0.5.0-SNAPSHOT]
        at org.apache.celeborn.service.deploy.worker.storage.PartitionFilesSorter$FileSorter.sort(PartitionFilesSorter.java:747) ~[celeborn-worker_2.12-0.5.0-SNAPSHOT.jar:0.5.0-SNAPSHOT]
        at org.apache.celeborn.service.deploy.worker.storage.PartitionFilesSorter.lambda$new$1(PartitionFilesSorter.java:164) ~[celeborn-worker_2.12-0.5.0-SNAPSHOT.jar:0.5.0-SNAPSHOT]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[?:1.8.0_162]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_162]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_162]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_162]
        at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_162]
    ```
    
    ### Does this PR introduce _any_ user-facing change?
    
    No.
    
    ### How was this patch tested?
    
    GA.
    
    Closes #2847 from SteNicholas/CELEBORN-1669.
    
    Authored-by: SteNicholas <[email protected]>
    Signed-off-by: mingji <[email protected]>
---
 .../celeborn/service/deploy/worker/storage/PartitionFilesSorter.java | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/worker/src/main/java/org/apache/celeborn/service/deploy/worker/storage/PartitionFilesSorter.java b/worker/src/main/java/org/apache/celeborn/service/deploy/worker/storage/PartitionFilesSorter.java
index 86315d329..2cc6c3292 100644
--- a/worker/src/main/java/org/apache/celeborn/service/deploy/worker/storage/PartitionFilesSorter.java
+++ b/worker/src/main/java/org/apache/celeborn/service/deploy/worker/storage/PartitionFilesSorter.java
@@ -459,7 +459,10 @@ public class PartitionFilesSorter extends ShuffleRecoverHelper {
 
   @VisibleForTesting
   public void updateSortedShuffleFiles(String shuffleKey, String fileId, long fileLength) {
-    sortedShuffleFiles.get(shuffleKey).add(fileId);
+    Set<String> shuffleFiles = sortedShuffleFiles.get(shuffleKey);
+    if (shuffleFiles != null) {
+      shuffleFiles.add(fileId);
+    }
     sortedFileCount.incrementAndGet();
     sortedFilesSize.addAndGet(fileLength);
   }
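
The interleaving described in the commit message can be replayed deterministically in a small standalone sketch. This is not the Celeborn class: `SortedFilesSketch`, `register`, `cleanup`, and `updateUnguarded` are hypothetical names, and the field types (`ConcurrentHashMap`, `AtomicInteger`, `AtomicLong`) are assumptions, since the diff only shows how the fields are used. Only the body of `updateGuarded` mirrors the patched method.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the relevant state of PartitionFilesSorter.
class SortedFilesSketch {
  // Field names follow the diff; their concrete types are assumed.
  private final ConcurrentHashMap<String, Set<String>> sortedShuffleFiles =
      new ConcurrentHashMap<>();
  private final AtomicInteger sortedFileCount = new AtomicInteger();
  private final AtomicLong sortedFilesSize = new AtomicLong();

  // Hypothetical helper: start tracking sorted files for a shuffle key.
  void register(String shuffleKey) {
    sortedShuffleFiles.putIfAbsent(shuffleKey, ConcurrentHashMap.newKeySet());
  }

  // Hypothetical helper: what the cleaner thread does to this map.
  void cleanup(String shuffleKey) {
    sortedShuffleFiles.remove(shuffleKey);
  }

  // Pre-patch behaviour: get() returns null once the key has been removed.
  void updateUnguarded(String shuffleKey, String fileId) {
    sortedShuffleFiles.get(shuffleKey).add(fileId);
  }

  // Post-patch behaviour: skip the set update when the key is gone,
  // but still update the counters, as in the diff.
  void updateGuarded(String shuffleKey, String fileId, long fileLength) {
    Set<String> shuffleFiles = sortedShuffleFiles.get(shuffleKey);
    if (shuffleFiles != null) {
      shuffleFiles.add(fileId);
    }
    sortedFileCount.incrementAndGet();
    sortedFilesSize.addAndGet(fileLength);
  }

  int sortedFileCount() {
    return sortedFileCount.get();
  }

  long sortedFilesSize() {
    return sortedFilesSize.get();
  }

  boolean isTracked(String shuffleKey) {
    return sortedShuffleFiles.containsKey(shuffleKey);
  }

  public static void main(String[] args) {
    SortedFilesSketch sorter = new SortedFilesSketch();
    sorter.register("app-1-0");
    sorter.cleanup("app-1-0"); // cleaner runs first
    try {
      sorter.updateUnguarded("app-1-0", "1875-0-0"); // sorter runs second
    } catch (NullPointerException e) {
      System.out.println("unguarded update failed with NullPointerException");
    }
    sorter.updateGuarded("app-1-0", "1875-0-0", 128L);
    System.out.println("guarded update completed, count=" + sorter.sortedFileCount());
  }
}
```

Note that in this sketch, as in the diff, the counters are still incremented for a shuffle key that has already been cleaned up; whether that is the desired accounting is not stated in the commit message.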
