[ 
https://issues.apache.org/jira/browse/FELIX-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089684#comment-18089684
 ] 

sahvx655-wq commented on FELIX-6844:
------------------------------------

Thanks for the feedback.

I have updated the PR to make the event dispatch thread more resilient: it is 
now guarded against unexpected termination, and listeners that fail repeatedly 
are handled so that they do not hold up event processing.
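
To illustrate the idea only (this is not the code in the PR), here is a 
minimal sketch of a dispatch loop that contains listener failures and skips 
listeners that keep failing. The class name GuardedDispatcher, the use of 
java.util.function.Consumer as the listener type, and the failure threshold 
are all assumptions made for the example; none of them are Felix APIs.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical helper for illustration; not a class from Apache Felix or the PR. */
final class GuardedDispatcher<E> {
    private final int maxConsecutiveFailures;
    // Consecutive failure count per listener; assumed to be touched by one dispatch thread only.
    private final Map<Consumer<E>, Integer> failures = new HashMap<>();

    GuardedDispatcher(int maxConsecutiveFailures) {
        if (maxConsecutiveFailures < 1) {
            throw new IllegalArgumentException("maxConsecutiveFailures must be >= 1");
        }
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    /** Delivers the event to every active listener and returns how many accepted it. */
    int dispatch(E event, List<Consumer<E>> listeners) {
        int delivered = 0;
        for (Consumer<E> listener : listeners) {
            if (isSuspended(listener)) {
                continue; // skip listeners that keep failing
            }
            try {
                listener.accept(event);
                failures.remove(listener); // a success resets the count
                delivered++;
            } catch (RuntimeException ex) {
                // Contain the failure so the loop (and its thread) keeps running.
                failures.merge(listener, 1, Integer::sum);
            }
        }
        return delivered;
    }

    /** True once the listener has failed maxConsecutiveFailures times in a row. */
    boolean isSuspended(Consumer<E> listener) {
        return failures.getOrDefault(listener, 0) >= maxConsecutiveFailures;
    }
}
```

The sketch catches RuntimeException only, so an Error such as 
OutOfMemoryError would still propagate; whether to catch Throwable instead is 
a separate design decision.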

Please let me know if you would like any further adjustments.

> Implement self-healing recovery mechanism for FelixDispatchQueue thread in 
> EventDispatcher
> ------------------------------------------------------------------------------------------
>
>                 Key: FELIX-6844
>                 URL: https://issues.apache.org/jira/browse/FELIX-6844
>             Project: Felix
>          Issue Type: Bug
>          Components: Framework
>            Reporter: sahvx655-wq
>            Priority: Major
>
> The {{EventDispatcher}} uses a single background thread 
> ({{FelixDispatchQueue}}) for asynchronous event delivery. If this thread 
> unexpectedly terminates due to runtime errors (e.g., 
> {{OutOfMemoryError}}, {{StackOverflowError}}, or unhandled 
> exceptions), asynchronous event processing permanently stops, leaving the 
> system in a degraded state with no automatic recovery.
> This change introduces a controlled self-healing mechanism to detect a dead 
> dispatcher thread and restart it safely.
> If {{FelixDispatchQueue}} is found to be non-alive during 
> {{fireEventAsynchronously()}}, the system triggers a controlled restart 
> process. To avoid restart loops, recovery is limited using a maximum retry 
> count, cooldown period, and backoff delay. If the retry limit is exceeded, 
> auto-recovery is disabled and an error is logged to prevent further resource 
> exhaustion.
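
The restart limits described in the issue (maximum retry count, cooldown 
period, backoff delay) could be sketched roughly as below. This is a sketch 
under stated assumptions, not the PR's implementation: the class 
RestartPolicy and its method names are hypothetical, the backoff is assumed 
to be exponential, and the cooldown is interpreted as a quiet period after 
which the retry counter resets. The issue text does not specify either 
detail.

```java
/** Hypothetical restart limiter for illustration; not part of Apache Felix. */
final class RestartPolicy {
    private final int maxRetries;
    private final long cooldownMillis;
    private final long baseBackoffMillis;

    private int attempts;
    private long lastRestartMillis;
    private boolean hasRestarted;
    private boolean disabled;

    RestartPolicy(int maxRetries, long cooldownMillis, long baseBackoffMillis) {
        this.maxRetries = maxRetries;
        this.cooldownMillis = cooldownMillis;
        this.baseBackoffMillis = baseBackoffMillis;
    }

    /**
     * Returns the delay in milliseconds to wait before restarting the
     * dispatcher thread, or -1 if auto-recovery is disabled.
     */
    synchronized long nextRestartDelay(long nowMillis) {
        if (disabled) {
            return -1;
        }
        // Assumption: a quiet period of at least cooldownMillis since the last
        // restart means that restart held, so earlier failures are forgotten.
        if (hasRestarted && nowMillis - lastRestartMillis >= cooldownMillis) {
            attempts = 0;
        }
        if (attempts >= maxRetries) {
            disabled = true; // retry limit exceeded: stop trying
            return -1;
        }
        // Assumed exponential backoff: base, 2 * base, 4 * base, ...
        // (safe from overflow only for small maxRetries and base values).
        long delay = baseBackoffMillis << attempts;
        attempts++;
        lastRestartMillis = nowMillis;
        hasRestarted = true;
        return delay;
    }

    synchronized boolean isDisabled() {
        return disabled;
    }
}
```

A caller would check Thread.isAlive() on the dispatcher thread, ask the 
policy for a delay before starting a replacement thread, and log an error 
once the policy returns -1.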



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
