zyclove opened a new issue, #3520: URL: https://github.com/apache/celeborn/issues/3520
Celeborn 0.6.1 on k8s

<img width="1766" height="819" alt="Image" src="https://github.com/user-attachments/assets/d2618fed-48f9-4873-8b34-edc5d1b4daff" />

2025-10-30T08:35:17.350245825Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.

Config:

celeborn:
  celeborn.master.ha.enabled: true
  celeborn.metrics.enabled: true
  celeborn.storage.availableTypes: MEMORY,SSD
  celeborn.worker.storage.storagePolicy.createFilePolicy: MEMORY,SSD
  celeborn.metrics.prometheus.path: /metrics/prometheus
  celeborn.master.http.port: 9098
  celeborn.worker.http.port: 9096
  celeborn.worker.monitor.disk.enabled: true
  celeborn.shuffle.chunk.size: 8m
  celeborn.rpc.io.serverThreads: 32
  celeborn.rpc.io.numConnectionsPerPeer: 2
  celeborn.rpc.io.clientThreads: 32
  celeborn.rpc.dispatcher.numThreads: 8
  celeborn.worker.fetch.io.threads: 32
  celeborn.worker.push.io.threads: 32
  celeborn.push.stageEnd.timeout: 120s
  celeborn.application.heartbeat.timeout: 120s
  celeborn.worker.heartbeat.timeout: 120s
  celeborn.worker.commitFiles.threads: 32
  celeborn.worker.commitFiles.timeout: 240s
  celeborn.worker.flusher.buffer.size: 8m
  celeborn.master.slot.assign.policy: loadaware
  # celeborn.worker.flusher.hdfs.threads: 64
  # celeborn.worker.flusher.hdfs.buffer.size: 64m
  # celeborn.storage.hdfs.dir: hdfs://ha-nn-uri/celeborn
  # celeborn.worker.storage.disk.reserve.size: 100G
# -- Celeborn environment variables

More logs:

```
6,107 INFO [worker-expired-shuffle-cleaner] ChunkStreamManager: Cleaned up expired shuffle keys.
The count of shuffle keys and streams: 9, 24","offset":4710459,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:50.345581123+00:00"}2025-10-30 16:35:50.378{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:46.107794438Z stdout F 25/10/30 16:35:46,107 INFO [worker-expired-shuffle-cleaner] ChunkStreamManager: Clean up expired shuffle keys ","offset":4710308,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:50.345564336+00:00"}2025-10-30 16:35:50.378{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:46.034397658Z stdout F DiskInfo(maxSlots: 0, availableSlots: 0, committed shuffles 18, running applications 13, shuffleAllocations: Map(spark-ed67a36c0 b52403396c243998bccdb4d-0 -> 30, spark-d2b9d02683fa4befb2837700ed8102c3-1 -> 8, spark-5c868261d80c4532841cdfe324dbe09a-0 -> 79, spark-31da066acfb646869bc8a7d67fc02eef-0 -> 8, spark-0c47a0fbd5e040fc83ce8db93a8c95bd-1 -> 8, spark-1d8ea77c34c14c84ab5fcaa5a8b182d1-2 -> 9, spark-b7ff4fc65887425b84f64d1a3ee822c1-2 -> 1, spark-ab66f5b1510b476cb8be9dc29eb72dd4-0 -> 7, spark-8c03135671a74c689677481b987ab7fb-1 -> 16, spark-0a31d244de9c4c6a87e5a441119a299a-9 -> 245), mountPoint: /data2, usableSpace: 554.7 GiB, totalSpace: 786.4 GiB, avgFlushTime: 15.4 ms, avgFetchTime: 326.9 ms, activeSlots: 411, storageType: HDD) status: HEALTHY dirs /data2/celeborn-worker/shuffle_data","offset":4709472,"pod":"celeborn-worker-1...80 more2025-10-30 16:35:50.378{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:46.034395438Z stdout F DiskInfo(maxSlots: 0, availableSlots: 0, committed shuffles 18, running applications 13, shu ffleAllocations: Map(spark-ed67a36c0b52403396c243998bccdb4d-0 -> 33, 
spark-d2b9d02683fa4befb2837700ed8102c3-1 -> 4, spark-5c868261d80c4532841cdfe324dbe09a-0 -> 82, spark-31da066acfb646869bc8a7d67fc02eef-0 -> 4, spark-0c47a0fbd5e040fc83ce8db93a8c95bd-1 -> 8, spark-1d8ea77c34c14c84ab5fcaa5a8b182d1-2 -> 14, spark-b7ff4fc65887425b84f64d1a3ee822c1-2 -> 6, spark-ab66f5b1510b476cb8be9dc29eb72dd4-0 -> 8, spark-8c03135671a74c689677481b987ab7fb-1 -> 15, spark-0a31d244de9c4c6a87e5a441119a299a-9 -> 214), mountPoint: /data1, usableSpace: 554.7 GiB, totalSpace: 786.4 GiB, avgFlushTime: 7.8 ms, avgFetchTime: 165.5 ms, activeSlots: 388, storageType: HDD) status: HEALTHY dirs /data1/celeborn-worker/shuffle_data","offset":4708636,"pod":"celeborn-worker-1...80 more2025-10-30 16:35:50.378{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:46.034377414Z stdout F 25/10/30 16:35:46,034 INFO [worker-forward-message-sched uler] StorageManager: Updated diskInfos:","offset":4708499,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:50.345510869+00:00"}2025-10-30 16:35:50.378{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:42.653227984Z stdout F 25/10/30 16:35:42,653 INFO [worker-memory-manager-reporter] MemoryManager: Direct memory usage: 1816.1 MiB/7.0 GiB, disk buffer size: 749.3 MiB, sort memory size: 0.0 B, read buffer size: 0.0 B, memory file storage size : 0.0 B","offset":4708231,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:50.345492232+00:00"}2025-10-30 16:35:35.625{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:32.653067369Z stdout F 25/10/30 16:35:32,652 INFO [worker-memory-manager-reporter] MemoryManager: Direct memory usage: 1816.1 MiB/7.0 GiB , disk buffer size: 745.5 MiB, sort memory size: 0.0 B, read buffer 
size: 0.0 B, memory file storage size : 0.0 B","offset":4707963,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:35.608865062+00:00"}2025-10-30 16:35:32.992{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:29.205113175Z stdout F 25/10/30 16:35:29,205 INFO [worker-rpc-async-replier] Controller: CommitFiles for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-0 success with 16 committed primary partitions 183-0,72-0,13-0,127-0,23-0,34-0,5-0,99-0,85-0,141-0,169-0,46-0,155-0,113-0,59-0,197-0, 0 empty primary partitions , 0 failed primary partitions , 0 committed replica partitions , 0 empty replica partitions , 0 failed replica partitions .","offset":4707517,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:32.916780225+00:00"}2025-10-30 16:35:25.766{"host":"filebeat-prod-a- rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:23.443997277Z stdout F 25/10/30 16:35:23,443 INFO [celeborn-dispatcher-1] Controller: Reserved 1 primary location 124-0 and 0 replica location for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-1 ","offset":4707310,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:25.761770573+00:00"}2025-10-30 16:35:25.766{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:23.443906677Z stdout F 25/10/30 16:35:23,443 INFO [celeborn-dispatcher-1] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/124-0-0","offset":4707102,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:25.761750643+00:00"}2025-10-30 16:35:25.766{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"1 
45912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:22.652882693Z stdout F 25/10/30 16:35:22,652 INFO [worker-memory-manager-reporter] MemoryManager: Direct memory usage: 3.5 GiB/7.0 GiB, disk buffer size: 438.6 MiB, sort memory size: 0.0 B, read buffer size: 0.0 B, memory file storage size : 0.0 B","offset":4706837,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:25.761714514+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391763654Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] Controller: Reserved 16 primary location 198-0,184-0,170-0,156-0,142-0,128-0,114-0,100-0,86-0,73-0,60-0,47-0,35-0,24-0,13-0,5-0 and 0 replica location for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-1 ","offset":4706548,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688498565+00:00"}2025- 10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391716903Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/5-0-0","offset":4706342,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688480840+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391659641Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/13-0-0","offset":4706135,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688463756+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq", 
"k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391613127Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/24-0-0","offset":4705928,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688446437+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391553949Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/35-0-0","offset":4705721,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688426955+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0- 05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391513695Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/47-0-0","offset":4705514,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688409922+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391452477Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/60-0-0","offset":4705307,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688392967+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10- 
30T08:35:20.391403007Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/73-0-0","offset":4705100,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688375337+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391348623Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/86-0-0","offset":4704893,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688357035+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.39129975Z stdout F 25/10/30 16:35:20, 391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/100-0-0","offset":4704686,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688329069+00:00"}2025-10-30 16:35:22.732{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.39124118Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/114-0-0","offset":4704479,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688311757+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.39120686Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: 
created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/128-0-0","offset":4704272,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688292614+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391133186Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/142-0-0","offset":4704064,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688275276+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391100595Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_ data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/156-0-0","offset":4703856,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688257222+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391017644Z stdout F 25/10/30 16:35:20,390 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/170-0-0","offset":4703648,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688239698+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.390944288Z stdout F 25/10/30 16:35:20,390 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/1 
84-0-0","offset":4703440,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688221780+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.390790179Z stdout F 25/10/30 16:35:20,390 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/198-0-0","offset":4703232,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688203164+00:00"}2025-10-30 16:35:22.731{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.052953207Z stdout F 25/10/30 16:35:20,052 INFO [celeborn-dispatcher-4] Controller: Start commitFiles for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-0, primaryIds : 141-0,113-0,46-0,183-0,72-0,155-0,59-0,85-0,23-0,127-0,13-0,34-0,197 -0,169-0,5-0,99-0, replicaIds : ","offset":4702951,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688183085+00:00"}2025-10-30 16:35:19.672{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.369811091Z stdout F 25/10/30 16:35:17,369 INFO [worker-rpc-async-replier] Controller: CommitFiles for spark-b943e38f2d204caa9348c7d0fd1508e5-3 success with 18 committed primary partitions 180-0,194-0,82-0,39-0,138-0,10-0,2-0,31-0,20-0,110-0,43-0,152-0,96-0,162-0,56-0,124-0,166-0,69-0, 0 empty primary partitions , 0 failed primary partitions , 0 committed replica partitions , 0 empty replica partitions , 0 failed replica partitions .","offset":4702494,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592442760+00:00"}2025-10-30 16:35:19.672{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293 
-0c808add24d2"},"message":"2025-10-30T08:35:17.350245825Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4702258,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592425595+00:00"}2025-10-30 16:35:19.672{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.350210606Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4702022,"pod":"celeborn-worker-14","tags":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:32.653067369Z stdout F 25/10/30 16:35:32,652 INFO [worker-memory-manager-repor ter] MemoryManager: Direct memory usage: 1816.1 MiB/7.0 GiB, disk buffer size: 745.5 MiB, sort memory size: 0.0 B, read buffer size: 0.0 B, memory file storage size : 0.0 B","offset":4707963,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:35.608865062+00:00"} | | | 2025-10-30 16:35:32.992 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:29.205113175Z stdout F 25/10/30 16:35:29,205 INFO [worker-rpc-async-replier] Controller: CommitFiles for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-0 success with 16 committed primary partitions 183-0,72-0,13-0,127-0,23-0,34-0,5-0,99-0,85-0,141-0,169-0,46-0,155-0,113-0,59-0,197-0, 0 empty primary partitions , 0 failed primary partitions , 0 committed replica partitions , 0 empty replica partitions , 0 failed replica partitions 
.","offset":4707517,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:32.916780225+00:00"} | | | 2025-10-30 16:35:25.766 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:23.443997277Z stdout F 25/10/30 16:35:23,443 INFO [celeborn-dispatcher-1] Controller: Reserved 1 primary location 124-0 and 0 replica location for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-1 ","offset":4707310,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:25.761770573+00:00"} | | | 2025-10-30 16:35:25.766 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:23.443906677Z stdout F 25/10/30 16:35:23,443 INFO [celeborn-dispatcher-1] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/124-0-0","offset":4707102,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:25.761750643+00:00"} | | | 2025-10-30 16:35:25.766 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:22.652882693Z stdout F 25/10/30 16:35:22,652 INFO [worker-memory-manager-reporter] MemoryManager: Direct memory usage: 3.5 GiB/7.0 GiB, disk buffer size: 438.6 MiB, sort memory size: 0.0 B, read buffer size: 0.0 B, memory file storage size : 0.0 B","offset":4706837,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:25.761714514+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391763654Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] Controller: Reserved 16 primary location 
198-0,184-0,170-0,156-0,142-0,128-0,114-0,100-0,86-0,73-0,60-0,47-0,35-0,24-0,13-0,5-0 and 0 replica location for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-1 ","offset":4706548,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688498565+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391716903Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/5-0-0","offset":4706342,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688480840+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391659641Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/13-0-0","offset":4706135,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688463756+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391613127Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/24-0-0","offset":4705928,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688446437+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391553949Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created 
file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/35-0-0","offset":4705721,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688426955+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391513695Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/47-0-0","offset":4705514,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688409922+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391452477Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/60-0-0","offset":4705307,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688392967+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391403007Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/73-0-0","offset":4705100,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688375337+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391348623Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/86-0-0","offset":4704893,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688357035+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.39129975Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/100-0-0","offset":4704686,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688329069+00:00"} | | | 2025-10-30 16:35:22.732 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.39124118Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/114-0-0","offset":4704479,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688311757+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.39120686Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/128-0-0","offset":4704272,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688292614+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391133186Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/142-0-0","offset":4704064,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688275276+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391100595Z stdout F 25/10/30 16:35:20,391 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/156-0-0","offset":4703856,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688257222+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.391017644Z stdout F 25/10/30 16:35:20,390 INFO [celeborn-dispatcher-2] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/170-0-0","offset":4703648,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688239698+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.390944288Z stdout F 25/10/30 16:35:20,390 INFO [celeborn-dispatcher-2] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/184-0-0","offset":4703440,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688221780+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.390790179Z stdout F 25/10/30 16:35:20,390 INFO [celeborn-dispatcher-2] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/1/198-0-0","offset":4703232,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688203164+00:00"} | | | 2025-10-30 16:35:22.731 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:20.052953207Z stdout F 25/10/30 16:35:20,052 INFO [celeborn-dispatcher-4] Controller: Start commitFiles for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-0, primaryIds : 141-0,113-0,46-0,183-0,72-0,155-0,59-0,85-0,23-0,127-0,13-0,34-0,197-0,169-0,5-0,99-0, replicaIds : ","offset":4702951,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:22.688183085+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.369811091Z stdout F 25/10/30 16:35:17,369 INFO [worker-rpc-async-replier] Controller: CommitFiles for spark-b943e38f2d204caa9348c7d0fd1508e5-3 success with 18 committed primary partitions 180-0,194-0,82-0,39-0,138-0,10-0,2-0,31-0,20-0,110-0,43-0,152-0,96-0,162-0,56-0,124-0,166-0,69-0, 0 empty primary partitions , 0 failed primary partitions , 0 committed replica partitions , 0 empty replica partitions , 0 failed replica partitions .","offset":4702494,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592442760+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.350245825Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 
ms.","offset":4702258,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592425595+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.350210606Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4702022,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592407549+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.35018214Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4701787,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592390125+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.350069252Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4701551,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592373050+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.350050583Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 
has error: Wait pending actions timeout after 120000 ms.","offset":4701315,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592355608+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.350029593Z stdout F 25/10/30 16:35:17,350 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4701079,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592329196+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349941234Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700843,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592311950+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349849629Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700607,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592293509+00:00"} | 2025-10-30 16:35:19.672{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.34975143Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip 
flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700372,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592274796+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.34975143Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700372,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592274796+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.34975143Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700372,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592274796+00:00"} | 2025-10-30 16:35:19.672{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349748197Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700136,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592253835+00:00"}2025-10-30 16:35:19.671{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349696515Z stdout F 25/10/30 16:35:17,349 WARN 
[worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699900,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593418973+00:00"}2025-10-30 16:35:19.671{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349671124Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699664,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593401788+00:00"}2025-10-30 16:35:19.671{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349638339Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699428,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593384284+00:00"}2025-10-30 16:35:19.671{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.309020961Z stdout F 25/10/30 16:35:17,308 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699192,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593367027+00:00"}2025-10-30 16:35:18.674{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.032535359Z stdout F 25/10/30 
16:35:16,032 INFO [worker-expired-shuffle-cleaner] ChunkStreamManager: Clean up expired shuffle keys spark-5ab33047e6194ad0b14c1ad907fa37fd-0","offset":4695013,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.601029665+00:00"}2025-10-30 16:35:18.674{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.03251913Z stdout F 25/10/30 16:35:16,032 INFO [worker-expired-shuffle-cleaner] Worker: Cleaned up expired shuffle spark-5ab33047e6194ad0b14c1ad907fa37fd-0","offset":4694838,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.601012215+00:00"}2025-10-30 16:35:18.674{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:15.957984141Z stdout F DiskInfo(maxSlots: 0, availableSlots: 0, committed shuffles 17, running applications 13, shuffleAllocations: Map(spark-ed67a36c0b52403396c243998bccdb4d-0 -> 30, spark-d2b9d02683fa4befb2837700ed8102c3-1 -> 8, spark-5c868261d80c4532841cdfe324dbe09a-0 -> 79, spark-31da066acfb646869bc8a7d67fc02eef-0 -> 8, spark-1d8ea77c34c14c84ab5fcaa5a8b182d1-2 -> 9, spark-b7ff4fc65887425b84f64d1a3ee822c1-2 -> 1, spark-ab66f5b1510b476cb8be9dc29eb72dd4-0 -> 7, spark-8c03135671a74c689677481b987ab7fb-1 -> 16, spark-0a31d244de9c4c6a87e5a441119a299a-9 -> 245, spark-b943e38f2d204caa9348c7d0fd1508e5-3 -> 8), mountPoint: /data2, usableSpace: 555.3 GiB, totalSpace: 786.4 GiB, avgFlushTime: 15.4 ms, avgFetchTime: 315.7 ms, activeSlots: 411, storageType: HDD) status: HEALTHY dirs /data2/celeborn-worker/shuffle_data","offset":4694002,"pod":"celeborn-worker-1...80 more2025-10-30 16:35:18.674{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:15.957982022Z stdout F DiskInfo(maxSlots: 0, availableSlots: 0, committed shuffles 17, running 
applications 13, shuffleAllocations: Map(spark-ed67a36c0b52403396c243998bccdb4d-0 -> 33, spark-d2b9d02683fa4befb2837700ed8102c3-1 -> 4, spark-5c868261d80c4532841cdfe324dbe09a-0 -> 82, spark-31da066acfb646869bc8a7d67fc02eef-0 -> 4, spark-1d8ea77c34c14c84ab5fcaa5a8b182d1-2 -> 14, spark-b7ff4fc65887425b84f64d1a3ee822c1-2 -> 6, spark-ab66f5b1510b476cb8be9dc29eb72dd4-0 -> 8, spark-8c03135671a74c689677481b987ab7fb-1 -> 15, spark-0a31d244de9c4c6a87e5a441119a299a-9 -> 214, spark-b943e38f2d204caa9348c7d0fd1508e5-3 -> 8), mountPoint: /data1, usableSpace: 555.2 GiB, totalSpace: 786.4 GiB, avgFlushTime: 7.8 ms, avgFetchTime: 161.2 ms, activeSlots: 388, storageType: HDD) status: HEALTHY dirs /data1/celeborn-worker/shuffle_data","offset":4693166,"pod":"celeborn-worker-1...80 more2025-10-30 16:35:18.674{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:15.95796217Z stdout F 25/10/30 16:35:15,957 INFO [worker-forward-message-scheduler] StorageManager: Updated diskInfos:","offset":4693030,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.600955264+00:00"}2025-10-30 16:35:18.568{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.72570289Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] Controller: Reserved 16 primary location 197-0,183-0,169-0,155-0,141-0,127-0,113-0,99-0,85-0,72-0,59-0,46-0,34-0,23-0,13-0,5-0 and 0 replica location for spark-0c47a0fbd5e040fc83ce8db93a8c95bd-0 ","offset":4698905,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564174306+00:00"}2025-10-30 16:35:18.568{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.72564417Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/5-0-0","offset":4698700,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564156081+00:00"}2025-10-30 16:35:18.568{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725598381Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/13-0-0","offset":4698493,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564137528+00:00"}2025-10-30 16:35:18.568{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725547167Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/23-0-0","offset":4698286,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564119957+00:00"}2025-10-30 16:35:18.568{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725502932Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/34-0-0","offset":4698079,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564101570+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725441035Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/46-0-0","offset":4697872,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564084445+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725402942Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/59-0-0","offset":4697665,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564066014+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725340624Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/72-0-0","offset":4697458,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564047351+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725293024Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/85-0-0","offset":4697251,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.564027822+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.72524026Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/99-0-0","offset":4697045,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563978363+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725188821Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/113-0-0","offset":4696837,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563960698+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725131585Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/127-0-0","offset":4696629,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563941761+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725092118Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/141-0-0","offset":4696421,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563923924+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.725022827Z stdout F 25/10/30 16:35:16,725 INFO [celeborn-dispatcher-6] StorageManager: created file at 
/data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/155-0-0","offset":4696213,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563905148+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.724924613Z stdout F 25/10/30 16:35:16,724 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/169-0-0","offset":4696005,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563887637+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.724872471Z stdout F 25/10/30 16:35:16,724 INFO [celeborn-dispatcher-6] StorageManager: created file at /data1/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/183-0-0","offset":4695797,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563869669+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.718706029Z stdout F 25/10/30 16:35:16,718 INFO [celeborn-dispatcher-6] StorageManager: created file at /data2/celeborn-worker/shuffle_data/spark-0c47a0fbd5e040fc83ce8db93a8c95bd/0/197-0-0","offset":4695589,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563852266+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.032860736Z stdout F 25/10/30 16:35:16,032 INFO [worker-expired-shuffle-cleaner-3743] StorageManager: Cleanup expired shuffle 
spark-5ab33047e6194ad0b14c1ad907fa37fd-0.","offset":4695402,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563833997+00:00"}2025-10-30 16:35:18.567{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:16.032541746Z stdout F 25/10/30 16:35:16,032 INFO [worker-expired-shuffle-cleaner] ChunkStreamManager: Cleaned up expired shuffle keys. The count of shuffle keys and streams: 7, 11","offset":4695204,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:18.563813684+00:00"}2025-10-30 16:35:15.691{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:12.652708887Z stdout F 25/10/30 16:35:12,652 INFO [worker-memory-manager-reporter] MemoryManager: Direct memory usage: 4.5 GiB/7.0 GiB, disk buffer size: 1681.4 MiB, sort memory size: 0.0 B, read buffer size: 0.0 B, memory file storage size : 0.0 B","offset":4692764,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:15.637036582+00:00"}2025-10-30 16:35:12.382{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:07.989385578Z stdout F 25/10/30 16:35:07,989 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4692528,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:12.327219698+00:00"}2025-10-30 16:35:12.382{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:07.989374821Z stdout F 25/10/30 16:35:07,989 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has 
error: Wait pending actions timeout after 120000 ms.","offset":4692292,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:12.327202995+00:00"}2025-10-30 16:35:12.382{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:07.989316491Z stdout F 25/10/30 16:35:07,989 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4692056,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:12.327186019+00:00"}2025-10-30 16:35:12.382{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:07.98927161Z stdout F 25/10/30 16:35:07,989 INFO [worker-rpc-async-replier] Controller: CommitFiles for spark-b943e38f2d204caa9348c7d0fd1508e5-4 success with 12 committed primary partitions 170-0,72-0,128-0,142-0,86-0,184-0,45-0,100-0,156-0,58-0,114-0,198-0, 0 empty primary partitions , 0 failed primary partitions , 0 committed replica partitions , 0 empty replica partitions , 0 failed replica partitions .","offset":4691629,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:12.327169405+00:00"}2025-10-30 16:35:12.382{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:07.989269208Z stdout F 25/10/30 16:35:07,989 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4691393,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:12.327152291+00:00"}2025-10-30 
16:35:12.382{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:07.989244228Z stdout F 25/10/30 16:35:07,989 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4691157,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:12.327116877+00:00"}2025-10-30 16:35:05.479{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:03.74081162Z stdout F 25/10/30 16:35:03,740 INFO [celeborn-dispatcher-0] Controller: Start commitFiles for spark-b943e38f2d204caa9348c7d0fd1508e5-3, primaryIds : 180-0,110-0,56-0,162-0,82-0,43-0,20-0,69-0,152-0,124-0,10-0,31-0,194-0,39-0,96-0,166-0,2-0,138-0, replicaIds : ","offset":4690866,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:05.389181692+00:00"}2025-10-30 16:35:04.733{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:02.652517512Z stdout F 25/10/30 16:35:02,652 INFO [worker-memory-manager-reporter] MemoryManager: Direct memory usage: 5.3 GiB/7.0 GiB, disk buffer size: 2.1 GiB, sort memory size: 0.0 B, read buffer size: 0.0 B, memory file storage size : 0.0 B","offset":4690603,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:04.687044079+00:00"}2025-10-30 16:35:01.626{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:00.011630961Z stdout F 25/10/30 16:35:00,011 INFO [worker-memory-manager-checker] ChannelsLimiter: push channels resume 
read.","offset":4690460,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:01.595125169+00:00"}2025-10-30 16:35:01.626{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:00.01162596Z stdout F 25/10/30 16:35:00,011 INFO [worker-memory-manager-checker] MemoryManager: Trigger action: RESUME PUSH","offset":4690319,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:01.595097343+00:00"}2025-10-30 16:35:01.626{"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:00.011620271Z stdout F 25/10/30 16:35:00,011 INFO [worker-memory-manager-checker] MemoryManager: Serving state transformed from PUSH_PAUSED to NONE_PAUSED","offset":4690147,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:01.595068804+00:00"} | | | 2025-10-30 16:35:19.672 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349748197Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4700136,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.592253835+00:00"} | | | | 2025-10-30 16:35:19.671 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349696515Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 
ms.","offset":4699900,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593418973+00:00"} | | | | 2025-10-30 16:35:19.671 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349671124Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699664,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593401788+00:00"} | | | | 2025-10-30 16:35:19.671 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.349638339Z stdout F 25/10/30 16:35:17,349 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699428,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593384284+00:00"} | | | | 2025-10-30 16:35:19.671 | {"host":"filebeat-prod-a-rx9tq","k8s":{"namespace":"celeborn","pod_id":"145912d0-05f9-418a-b293-0c808add24d2"},"message":"2025-10-30T08:35:17.309020961Z stdout F 25/10/30 16:35:17,308 WARN [worker-memory-manager-actor] StorageManager: Skip flushOnMemoryPressure because LocalFlusher@1333945918-/data2 has error: Wait pending actions timeout after 120000 ms.","offset":4699192,"pod":"celeborn-worker-14","tags":["smart","","prod"],"vector_ts":"2025-10-30T08:35:19.593367027+00:00"} | | | | 2025-10-30 16:35:18.674 | {"host":"filebeat-prod-a-rx9tq","k8s": ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
