This is an automated email from the ASF dual-hosted git repository.
dchen pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/samza.git
The following commit(s) were added to refs/heads/master by this push:
new 5b4b1b70b Improve Samza AM retry count logging (#1701)
5b4b1b70b is described below
commit 5b4b1b70b824c1d9ab97428b1bf0c50c1cf7818b
Author: Michael Barskii <[email protected]>
AuthorDate: Thu Aug 8 11:05:33 2024 -0700
Improve Samza AM retry count logging (#1701)
* LISAMZA-43659 Improve Samza AM retry count logging
* Output Current Timestamp at run-class.sh script (#1702)
* print current timestamp
* Fix typo
* fix build issue about grolifant okhttp
---------
Co-authored-by: Haolan Ye <[email protected]>
* Revert "Output Current Timestamp at run-class.sh script (#1702)"
This reverts commit 1e84ac05eb70d7b1880bb2a62a03e5e1a4b3ef16.
---------
Co-authored-by: Michael Barskii <[email protected]>
Co-authored-by: Haolan Ye <[email protected]>
Co-authored-by: Haolan Ye <[email protected]>
---
.../org/apache/samza/clustermanager/ContainerProcessManager.java | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java b/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
index f8719890e..756241f06 100644
--- a/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
+++ b/samza-core/src/main/java/org/apache/samza/clustermanager/ContainerProcessManager.java
@@ -572,9 +572,10 @@ public class ContainerProcessManager implements ClusterResourceManager.Callback
// if fail count is (1 initial failure + max retries) then fail job.
if (currentFailCount > retryCount) {
- LOG.error("Processor ID: {} (current Container ID: {}) has failed {} times, with last failure {} ms ago. " +
- "This is greater than retry count of {} and window of {} ms, ",
- processorId, containerId, currentFailCount, durationSinceLastRetryMs, retryCount, retryWindowMs);
+ LOG.error("Processor ID: {} (current Container ID: {}) has failed {} times. "
+ + "This is greater that the retry count of {}."
+ + "The failure occurred {} ms after the previous one, which is less than the retry window of {} ms.",
+ processorId, containerId, currentFailCount, retryCount, durationSinceLastRetryMs, retryWindowMs);
// We have too many failures, and we're within the window
// boundary, so reset shut down the app master.