vastian180 opened a new pull request, #3069: URL: https://github.com/apache/celeborn/pull/3069
### What changes were proposed in this pull request?

As the title states, improve the calculation logic of the `PausePushDataTime`, `PausePushDataAndReplicateTime`, and `pausePushDataCounter` metrics.

### Why are the changes needed?

During a stress test, the `pausePushDataAndReplicateTime` metric reported a value of 55.1 years, which is obviously abnormal.

[Figure: screenshot of the abnormal metric value]

The reason is as follows. In the `ServingState` transition `NONE PAUSED` -> `PAUSE PUSH` -> `PAUSE PUSH AND REPLICATE`, `pausePushDataAndReplicateStartTime` was not correctly assigned, so it still held `-1L`. When `trimCounter >= forceAppendPauseSpentTimeThreshold`, or when `ServingState` changes from `PAUSE PUSH AND REPLICATE` -> `NONE PAUSED`, the `appendPauseSpentTime` method is executed to update `pausePushDataAndReplicateTime`. The computation is effectively `pausePushDataAndReplicateTime += System.currentTimeMillis() - -1L`, which is displayed as 55.1 years (`System.currentTimeMillis()/1000/3600/24/365`).

Similarly, in the `ServingState` transition `NONE PAUSED` -> `PAUSE PUSH AND REPLICATE` -> `PAUSE PUSH`, `pausePushDataStartTime` was not correctly assigned. When `trimCounter >= forceAppendPauseSpentTimeThreshold`, or when `ServingState` changes from `PAUSE PUSH` -> `NONE PAUSED`, the `appendPauseSpentTime` method is executed to update `pausePushDataTime`, which is likewise displayed as 55.1 years.

Modify the logic of `pausePushDataCounter`. In the `PAUSE PUSH AND REPLICATE` state the worker also stops receiving pushData. Therefore:

- When `NONE PAUSED` -> `PAUSE PUSH AND REPLICATE`: `pausePushDataCounter` needs to be increased.
- When `PAUSE PUSH AND REPLICATE` -> `PAUSE PUSH`: `pausePushDataCounter` does not need to be increased.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Celeborn Dashboard

MemoryManagerSuite#[CELEBORN-882] Test MemoryManager check memory thread logic

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
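
The arithmetic behind the 55.1-year value, and the kind of bookkeeping the pull request describes, can be sketched as follows. This is a minimal, hypothetical model and not Celeborn's actual `MemoryManager`: the class `PauseTimeTracker`, the enum constant names, the `UNSET` constant, the injected clock, and the sample timestamp are all assumptions made for illustration. It also assumes that `PAUSE PUSH` -> `PAUSE PUSH AND REPLICATE` does not increase `pausePushDataCounter`, which the description does not state explicitly.

```java
import java.util.function.LongSupplier;

/** Hypothetical, simplified model of the pause-time bookkeeping (not Celeborn code). */
final class PauseTimeTracker {
    enum ServingState { NONE_PAUSED, PUSH_PAUSED, PUSH_AND_REPLICATE_PAUSED }

    /** Sentinel meaning "no start time recorded"; mirrors the -1L in the description. */
    static final long UNSET = -1L;

    private final LongSupplier clock; // injected so the sketch is deterministic
    private ServingState state = ServingState.NONE_PAUSED;
    private long pausePushDataStartTime = UNSET;
    private long pausePushDataAndReplicateStartTime = UNSET;

    long pausePushDataTime = 0L;
    long pausePushDataAndReplicateTime = 0L;
    long pausePushDataCounter = 0L;

    PauseTimeTracker(LongSupplier clock) {
        this.clock = clock;
    }

    void transitionTo(ServingState next) {
        if (next == state) {
            return;
        }
        long now = clock.getAsLong();

        // Leaving a paused state: add the elapsed time, but only if a start time was recorded.
        if (state == ServingState.PUSH_PAUSED && pausePushDataStartTime != UNSET) {
            pausePushDataTime += now - pausePushDataStartTime;
            pausePushDataStartTime = UNSET;
        } else if (state == ServingState.PUSH_AND_REPLICATE_PAUSED
                && pausePushDataAndReplicateStartTime != UNSET) {
            pausePushDataAndReplicateTime += now - pausePushDataAndReplicateStartTime;
            pausePushDataAndReplicateStartTime = UNSET;
        }

        // Entering a paused state: always record its start time, whatever the previous state was.
        if (next == ServingState.PUSH_PAUSED) {
            pausePushDataStartTime = now;
        } else if (next == ServingState.PUSH_AND_REPLICATE_PAUSED) {
            pausePushDataAndReplicateStartTime = now;
        }

        // Pushes stop only when leaving NONE_PAUSED; moving between paused states is not a new pause.
        if (state == ServingState.NONE_PAUSED) {
            pausePushDataCounter++;
        }
        state = next;
    }

    /** The conversion quoted in the description: milliseconds to years. */
    static double toYears(long millis) {
        return millis / 1000.0 / 3600 / 24 / 365;
    }

    public static void main(String[] args) {
        long now = 1_737_000_000_000L; // hypothetical epoch-millis timestamp (early 2025)
        long buggyElapsed = now - UNSET; // subtracting the sentinel adds the whole epoch offset
        System.out.printf("buggy elapsed: %.1f years%n", toYears(buggyElapsed));
    }
}
```

Recording the start time on every entry into a paused state, and skipping accumulation while the sentinel is still in place, keeps the sentinel from ever reaching the subtraction.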
