devmadhuu opened a new pull request, #9322:
URL: https://github.com/apache/ozone/pull/9322
## What changes were proposed in this pull request?
This PR fixes the following code reliability and data integrity issues.
**Location: ReconTaskControllerImpl.java**
```
public synchronized void stop() {
  LOG.info("Stopping Recon Task Controller.");
  if (this.executorService != null) {
    this.executorService.shutdownNow(); // No awaitTermination
  }
  if (this.eventProcessingExecutor != null) {
    this.eventProcessingExecutor.shutdownNow(); // No awaitTermination
  }
}
```
**Impact:** Service reliability, data integrity
**Likelihood:** High (every service shutdown)
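To illustrate why the missing `awaitTermination` matters, here is a minimal standalone sketch (not code from this PR; the class name `ShutdownNowDemo` and the latch-based task are hypothetical). It assumes a task that keeps working after being interrupted, for example to finish a write. `shutdownNow()` only requests interruption and returns immediately, so the caller cannot tell whether in-flight work has finished unless it also waits.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

final class ShutdownNowDemo {

  /** Returns {terminated right after shutdownNow(), terminated after awaitTermination()}. */
  static boolean[] run() throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    CountDownLatch started = new CountDownLatch(1);
    CountDownLatch release = new CountDownLatch(1);

    executor.execute(() -> {
      started.countDown();
      boolean interrupted = false;
      // Hypothetical task: it keeps waiting for its work to complete
      // even when the worker thread is interrupted.
      while (true) {
        try {
          release.await();
          break;
        } catch (InterruptedException e) {
          interrupted = true;
        }
      }
      if (interrupted) {
        Thread.currentThread().interrupt(); // restore the interrupt flag
      }
    });

    started.await();
    executor.shutdownNow();
    // The task is still blocked on the latch, so the pool cannot be terminated yet.
    boolean terminatedImmediately = executor.isTerminated();

    release.countDown(); // let the task finish
    boolean terminatedAfterAwait = executor.awaitTermination(5, TimeUnit.SECONDS);
    return new boolean[] {terminatedImmediately, terminatedAfterAwait};
  }

  public static void main(String[] args) throws InterruptedException {
    boolean[] result = run();
    System.out.println("terminated right after shutdownNow(): " + result[0]);
    System.out.println("terminated after awaitTermination():  " + result[1]);
  }
}
```

In this sketch the first value is `false` and the second is `true`: the executor only reports termination once the caller has actually waited for the running task.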
**Fix:**
`private static final int SHUTDOWN_TIMEOUT_SECONDS = 30;`
```
public synchronized void stop() {
  LOG.info("Stopping Recon Task Controller.");
  shutdownExecutorGracefully(this.executorService, "main task executor");
  shutdownExecutorGracefully(this.eventProcessingExecutor,
      "event processing executor");
}
```
```
private void shutdownExecutorGracefully(ExecutorService executor,
    String name) {
  if (executor == null) {
    return;
  }
  // Stop accepting new tasks, but let running and queued tasks finish.
  executor.shutdown();
  try {
    if (!executor.awaitTermination(SHUTDOWN_TIMEOUT_SECONDS,
        TimeUnit.SECONDS)) {
      LOG.warn("Executor {} did not terminate within {} seconds, "
          + "forcing shutdown", name, SHUTDOWN_TIMEOUT_SECONDS);
      // Interrupt running tasks and discard queued ones.
      executor.shutdownNow();
      if (!executor.awaitTermination(5, TimeUnit.SECONDS)) {
        LOG.error("Executor {} did not terminate after forced shutdown",
            name);
      }
    }
  } catch (InterruptedException e) {
    LOG.warn("Interrupted while waiting for {} to terminate", name);
    executor.shutdownNow();
    // Preserve the interrupt status for the caller.
    Thread.currentThread().interrupt();
  }
}
```
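The two-phase pattern above can be exercised in isolation. The following is a rough sketch rather than the PR's implementation: `GracefulShutdownSketch`, the `Outcome` enum, and the millisecond timeouts are assumptions introduced for illustration, and logging is omitted. It shows the two paths the fix distinguishes: queued tasks draining during the graceful wait, and a long-blocking task being interrupted by the forced phase.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class GracefulShutdownSketch {

  /** Hypothetical result type describing which phase ended the executor. */
  enum Outcome { GRACEFUL, FORCED, STUCK, INTERRUPTED }

  static Outcome shutdown(ExecutorService executor, long gracefulMillis, long forcedMillis) {
    executor.shutdown(); // phase 1: no new tasks, existing ones may finish
    try {
      if (executor.awaitTermination(gracefulMillis, TimeUnit.MILLISECONDS)) {
        return Outcome.GRACEFUL;
      }
      executor.shutdownNow(); // phase 2: interrupt running tasks, drop queued ones
      return executor.awaitTermination(forcedMillis, TimeUnit.MILLISECONDS)
          ? Outcome.FORCED
          : Outcome.STUCK;
    } catch (InterruptedException e) {
      executor.shutdownNow();
      Thread.currentThread().interrupt(); // preserve the interrupt status
      return Outcome.INTERRUPTED;
    }
  }

  /** Short queued tasks are all allowed to complete before termination. */
  static Outcome drainQueuedTasks(AtomicInteger completed) {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    for (int i = 0; i < 5; i++) {
      executor.execute(completed::incrementAndGet);
    }
    return shutdown(executor, 5_000, 1_000);
  }

  /** A task that outlives the graceful window is interrupted in phase 2. */
  static Outcome interruptSleepingTask() {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    executor.execute(() -> {
      try {
        Thread.sleep(60_000);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // cooperate with the forced shutdown
      }
    });
    return shutdown(executor, 100, 5_000);
  }

  public static void main(String[] args) {
    AtomicInteger completed = new AtomicInteger();
    System.out.println(drainQueuedTasks(completed) + ", completed=" + completed.get());
    System.out.println(interruptSleepingTask());
  }
}
```

Unlike a bare `shutdownNow()`, the first phase lets already-submitted tasks run to completion, which is the property that matters when those tasks are writing data.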
**Location: OzoneManagerServiceProviderImpl.java**
`scheduler.shutdownNow(); // No awaitTermination`
**Risk:** Scheduler threads may not terminate, causing resource leaks and
preventing JVM shutdown.
**Impact:** Resource exhaustion, service restart failures
**Fix:**
```
private void stopSyncDataFromOMThread() {
  // Stop scheduling new runs, but let an in-progress sync finish.
  scheduler.shutdown();
  try {
    if (!scheduler.awaitTermination(30, TimeUnit.SECONDS)) {
      scheduler.shutdownNow();
      if (!scheduler.awaitTermination(5, TimeUnit.SECONDS)) {
        LOG.error("OM sync scheduler failed to terminate");
      }
    }
  } catch (InterruptedException e) {
    scheduler.shutdownNow();
    // Preserve the interrupt status for the caller.
    Thread.currentThread().interrupt();
  }
  tarExtractor.stop();
  LOG.debug("Shutdown the OM DB sync scheduler.");
}
```
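As a small illustration of what `shutdown()` followed by `awaitTermination` gives for a scheduler, the sketch below assumes a standard `java.util.concurrent.ScheduledExecutorService` running a fixed-delay task (the class `SchedulerShutdownSketch` and its counter are hypothetical and not part of this PR). With the default policies of the JDK's scheduled thread pool, `shutdown()` cancels future periodic runs while letting a run that is already executing complete, so no further runs happen once termination has been observed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class SchedulerShutdownSketch {

  /** Hypothetical holder for what was observed around termination. */
  static final class Result {
    final boolean terminated;
    final int runsAtTermination;
    final int runsLater;

    Result(boolean terminated, int runsAtTermination, int runsLater) {
      this.terminated = terminated;
      this.runsAtTermination = runsAtTermination;
      this.runsLater = runsLater;
    }
  }

  static Result run() throws InterruptedException {
    ScheduledExecutorService scheduler = Executors.newScheduledThreadPool(1);
    AtomicInteger runs = new AtomicInteger();
    CountDownLatch firstRun = new CountDownLatch(1);

    // Stand-in for a periodic sync job.
    scheduler.scheduleWithFixedDelay(() -> {
      runs.incrementAndGet();
      firstRun.countDown();
    }, 0, 10, TimeUnit.MILLISECONDS);

    firstRun.await();
    scheduler.shutdown(); // cancels future periodic runs (default policy)
    boolean terminated = scheduler.awaitTermination(5, TimeUnit.SECONDS);
    int runsAtTermination = runs.get();

    Thread.sleep(100); // several former periods elapse
    int runsLater = runs.get();
    return new Result(terminated, runsAtTermination, runsLater);
  }

  public static void main(String[] args) throws InterruptedException {
    Result r = run();
    System.out.println("terminated=" + r.terminated
        + ", runs at termination=" + r.runsAtTermination
        + ", runs 100 ms later=" + r.runsLater);
  }
}
```

Waiting for termination before stopping dependent components is what makes the ordering in the fix safe: anything stopped afterwards is no longer in use by a scheduled run.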
## What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-13956
## How was this patch tested?
This patch was tested with the existing JUnit and integration tests, and on a
local Docker cluster.