[
https://issues.apache.org/jira/browse/FINERACT-2449?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Krishna Mewara updated FINERACT-2449:
-------------------------------------
Description:
*Background* The current {{SimpleAsyncTaskExecutor}} creates a new thread for
every task. While effective for light loads, this unbounded behavior poses a
theoretical risk of thread exhaustion (OOM) under specific high-concurrency
scenarios (as originally reported in FINERACT-1934).
*Motivation for Change* Although the specific "thread explosion" from
FINERACT-1934 is difficult to reproduce in standard local/CI environments,
relying on an unbounded executor is contrary to Spring Boot best practices for
production-grade financial systems.
*Goal* Proactively replace the unbounded {{SimpleAsyncTaskExecutor}} with a
bounded, configurable {{ThreadPoolTaskExecutor}}. This ensures
deterministic resource usage and caps the number of threads the executor can
create, regardless of load.
*Proposed Solution*
# Replace {{SimpleAsyncTaskExecutor}} with {{ThreadPoolTaskExecutor}}.
# Configure safe defaults (Core: CPU * 2, Max: CPU * 5).
# Implement robust unit tests (using {{CountDownLatch}} patterns) to verify
that the new pool respects its bounds while maintaining parallelism.
# Ensure compatibility with all Fineract Modes (Read/Write/Batch).
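A minimal sketch of the kind of {{CountDownLatch}} check described above,
using only the JDK's {{ThreadPoolExecutor}} (which Spring's
{{ThreadPoolTaskExecutor}} wraps) so it runs without Spring on the classpath.
The class name, method name, queue capacity, and timeouts are illustrative
assumptions, not part of the Fineract code base. Note that the pool only grows
past its core size once the queue is full, so the sketch uses a bounded queue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper: saturates a bounded pool and reports peak parallelism.
class BoundedPoolSketch {

    static int measurePeakConcurrency(int core, int max, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                core, max, 60, TimeUnit.SECONDS, new ArrayBlockingQueue<>(queueCapacity));
        AtomicInteger active = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        CountDownLatch allWorkersBusy = new CountDownLatch(max); // reaches 0 when max tasks run at once
        CountDownLatch release = new CountDownLatch(1);          // holds every task until we let go
        int taskCount = max + queueCapacity;                     // exactly fills threads plus queue
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    peak.accumulateAndGet(active.incrementAndGet(), Math::max);
                    allWorkersBusy.countDown();
                    try {
                        release.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    } finally {
                        active.decrementAndGet();
                    }
                });
            }
            // Parallelism check: max tasks must be running simultaneously.
            if (!allWorkersBusy.await(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool never reached max parallelism");
            }
            release.countDown();
            pool.shutdown();
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("pool did not drain in time");
            }
            return peak.get(); // bound check: callers compare this against max
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        int core = cpus * 2; // proposed default core size
        int max = cpus * 5;  // proposed default max size
        System.out.println("core=" + core + " max=" + max
                + " peak=" + measurePeakConcurrency(core, max, 100));
    }
}
```

The first latch shows that the pool still reaches full parallelism; the
recorded peak shows that it never exceeds the configured maximum.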
was:
*Background* The current {{SimpleAsyncTaskExecutor}} creates a new thread for
every task, which can lead to thread exhaustion (accumulating thousands of
threads) under load, as reported in FINERACT-1934.
*The Issue* Simply replacing the executor with a {{ThreadPoolTaskExecutor}} or
applying a concurrency cap to the existing executor causes the application to
crash during startup.
*Technical Constraints* Investigation reveals that the current architecture
relies heavily on {{InheritableThreadLocal}} for security context propagation,
particularly during the boot process (Liquibase execution and initial event
multicasting).
As noted in the source code comments:
{quote}
// The application events (for importing) rely on the inheritable thread
// local security context strategy
// This is NOT compatible with threadpools so if we use threadpools the
// below will need to be reworked
{quote}
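The incompatibility mentioned in that comment can be reproduced with the JDK
alone. In the sketch below, {{CONTEXT}} is a hypothetical stand-in for the
security context holder; it is not Fineract or Spring Security code. An
{{InheritableThreadLocal}} value is copied once, when a thread is created, so
a reused pool thread keeps the value it inherited at creation, while a freshly
created thread sees the current one.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicReference;

class InheritedContextSketch {

    // Hypothetical stand-in for an inheritable security/tenant context.
    static final InheritableThreadLocal<String> CONTEXT = new InheritableThreadLocal<>();

    // Returns what a pooled thread sees on two submissions, then what a new thread sees.
    static String[] observe() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        Callable<String> readContext = CONTEXT::get;
        try {
            CONTEXT.set("tenant-A");
            // The single worker thread is created here and inherits "tenant-A".
            String firstPooled = pool.submit(readContext).get();

            CONTEXT.set("tenant-B");
            // Same worker thread is reused: it still holds the inherited "tenant-A".
            String secondPooled = pool.submit(readContext).get();

            // A brand-new thread (one thread per task) inherits the current value.
            AtomicReference<String> fresh = new AtomicReference<>();
            Thread t = new Thread(() -> fresh.set(CONTEXT.get()));
            t.start();
            t.join();

            return new String[] {firstPooled, secondPooled, fresh.get()};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] seen = observe();
        System.out.println(seen[0] + ", " + seen[1] + ", " + seen[2]);
        // prints "tenant-A, tenant-A, tenant-B"
    }
}
```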
*Attempts & Failures*
# *ThreadPoolTaskExecutor:* Caused Liquibase context failures because the
security context was not correctly propagated to reused threads.
# *TaskDecorator:* Attempted manual context propagation, but the complexity of
copying all contexts (MDC, Tenancy, Transaction) proved brittle and blocked
startup events.
# *Concurrency Cap:* Limiting the {{SimpleAsyncTaskExecutor}} caused
deadlocks/timeouts during startup because the boot process requires high
parallelism (100+ concurrent threads).
*Proposed Improvement* We need to redesign the threading model to:
# Decouple the startup/bootstrapping phase (which may require unbounded
threads) from the runtime phase.
# Implement a safe mechanism for Context Propagation that is compatible with
pooling (e.g., using {{TransmittableThreadLocal}} or a robust
{{TaskDecorator}}).
# Migrate to a bounded {{ThreadPoolTaskExecutor}} for runtime tasks to prevent
resource exhaustion.
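As a rough illustration of the decorator-style propagation in item 2, the
sketch below captures a context value on the submitting thread and restores it
around the task on the pooled thread. It uses only the JDK; in Spring the same
wrapping would sit behind the {{TaskDecorator}} interface
({{Runnable decorate(Runnable)}}). {{CONTEXT}} and the method names are
hypothetical, and only a single value is propagated here; MDC, tenancy, and
transaction state would each need equivalent handling.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ContextPropagationSketch {

    // Hypothetical stand-in for a per-request context (plain, non-inheritable).
    static final ThreadLocal<String> CONTEXT = new ThreadLocal<>();

    // Captures the caller's context now; installs and restores it around the task later.
    static Runnable withCapturedContext(Runnable task) {
        String captured = CONTEXT.get(); // read on the submitting thread
        return () -> {
            String previous = CONTEXT.get(); // whatever the pooled thread held before
            CONTEXT.set(captured);
            try {
                task.run();
            } finally {
                // Restore so no context leaks into the next task on this thread.
                if (previous == null) {
                    CONTEXT.remove();
                } else {
                    CONTEXT.set(previous);
                }
            }
        };
    }

    // Submits two tasks under different contexts to a single reused thread.
    static List<String> runForTwoTenants() {
        ExecutorService pool = Executors.newFixedThreadPool(1);
        List<String> seen = new CopyOnWriteArrayList<>();
        Runnable record = () -> seen.add(CONTEXT.get());
        try {
            CONTEXT.set("tenant-A");
            pool.submit(withCapturedContext(record)).get();
            CONTEXT.set("tenant-B");
            pool.submit(withCapturedContext(record)).get();
            return seen;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            CONTEXT.remove();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(runForTwoTenants()); // prints "[tenant-A, tenant-B]"
    }
}
```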
Summary: Replace unbounded SimpleAsyncTaskExecutor with bounded
ThreadPoolTaskExecutor (was: Rework Async Threading Model to support Context
Propagation and Pooling)
> Replace unbounded SimpleAsyncTaskExecutor with bounded ThreadPoolTaskExecutor
> -----------------------------------------------------------------------------
>
> Key: FINERACT-2449
> URL: https://issues.apache.org/jira/browse/FINERACT-2449
> Project: Apache Fineract
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.14.0
> Environment: Reproducible on all environments (Local Development,
> Docker, and Kubernetes).
> Platform:
> - OS: Linux / macOS / Windows (OS Agnostic)
> - Java Version: 17+ (Standard Fineract Runtime)
> - Framework: Spring Boot / Apache Fineract 1.x -> 1.14
> Infrastructure:
> - Issue is critical in Containerized/Kubernetes environments where thread
> exhaustion leads to Pod Eviction/OOMKilled.
> - High concurrency scenarios (>200 concurrent users) trigger the thread
> accumulation.
> Reporter: Krishna Mewara
> Priority: Major
> Labels: improvement, performance, technical-debt, threading
>