[ 
https://issues.apache.org/jira/browse/FELIX-6795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Hoh updated FELIX-6795:
-----------------------------
    Description: 
With FELIX-6663 in place, I have made some observations regarding 
healthcheck executions that take too long.

Example:
{noformat}
[Timer-0] org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl 
execution of healthchecks exceeded the timeout value of 5000ms. (Creation of 
descriptors=0ms, execution of the checks=5018ms, total=5018ms)
{noformat}

I found a large number of instances where the execution of the checks alone 
exceeded the configured limit of 2000ms. The following graph shows the absolute 
number of log messages in which the check time exceeded 2100ms (being 1-2ms 
over 2000ms is still acceptable):

 !screenshot-1.png! 
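
For reference, a minimal sketch of how such a count could be derived from the log. It is not the tooling used for the graph above. It assumes the warning appears on a single line in the log file, in the format of the example message; the function name and its parameters are hypothetical.

```python
import re

# Pattern modelled on the example warning from HealthCheckExecutorImpl.
# Assumption: the whole message sits on one line in the real log file.
PATTERN = re.compile(
    r"exceeded the timeout value of (?P<timeout>\d+)ms\.\s+"
    r"\(Creation of\s+descriptors=(?P<descriptors>\d+)ms,\s+"
    r"execution of the checks=(?P<checks>\d+)ms,\s+"
    r"total=(?P<total>\d+)ms\)"
)


def count_slow_executions(lines, threshold_ms=2100, ceiling_ms=10_000):
    """Count warnings whose check time is above threshold_ms.

    Entries above ceiling_ms are skipped, mirroring the exclusion of
    executions longer than 10 seconds.
    """
    count = 0
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue  # not an executor timeout warning
        checks_ms = int(match.group("checks"))
        if threshold_ms < checks_ms <= ceiling_ms:
            count += 1
    return count
```

Calling `count_slow_executions(open("error.log"))` on a log file (the file name is only a placeholder) would return the number of matching warnings; grouping by timestamp would be needed to reproduce a graph over time.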

We use healthchecks in many instances to manage their lifecycle, and in most 
cases they behave correctly; but as indicated, there are cases where, for some 
yet unknown reason, the checks can take much longer. I omitted the cases 
exceeding 10 seconds: they are rare, and other factors such as garbage 
collection come into play there, which I want to exclude here.



  was:
With FELIX-6663 in place, I have made some observation when it comes to 
healthcheck executions taking too long.

I found a large amount of instances where the execution of the checks alone 
exceeded the configured limit of 2000ms. 



> Healthcheck Executor exceeds allocated timeout
> ----------------------------------------------
>
>                 Key: FELIX-6795
>                 URL: https://issues.apache.org/jira/browse/FELIX-6795
>             Project: Felix
>          Issue Type: Improvement
>          Components: Health Checks
>    Affects Versions: healthcheck.core 2.3.0
>            Reporter: Joerg Hoh
>            Priority: Major
>         Attachments: screenshot-1.png
>
>
> With FELIX-6663 in place, I have made some observations regarding 
> healthcheck executions that take too long.
> Example:
> {noformat}
> [Timer-0] org.apache.felix.hc.core.impl.executor.HealthCheckExecutorImpl 
> execution of healthchecks exceeded the timeout value of 5000ms. (Creation of 
> descriptors=0ms, execution of the checks=5018ms, total=5018ms)
> {noformat}
> I found a large number of instances where the execution of the checks alone 
> exceeded the configured limit of 2000ms. The following graph shows the 
> absolute number of log messages in which the check time exceeded 2100ms 
> (being 1-2ms over 2000ms is still acceptable):
>  !screenshot-1.png! 
> We use healthchecks in many instances to manage their lifecycle, and in most 
> cases they behave correctly; but as indicated, there are cases where, for some 
> yet unknown reason, the checks can take much longer. I omitted the cases 
> exceeding 10 seconds: they are rare, and other factors such as garbage 
> collection come into play there, which I want to exclude here.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
