[
https://issues.apache.org/jira/browse/SLING-8408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836298#comment-16836298
]
Thomas Mueller commented on SLING-8408:
---------------------------------------
Patch for sling-org-apache-sling-distribution-core:
{noformat}
diff --git a/src/main/java/org/apache/sling/distribution/monitor/DistributionQueueHealthCheck.java b/src/main/java/org/apache/sling/distribution/monitor/DistributionQueueHealthCheck.java
index 38bf41e..caffc0d 100644
--- a/src/main/java/org/apache/sling/distribution/monitor/DistributionQueueHealthCheck.java
+++ b/src/main/java/org/apache/sling/distribution/monitor/DistributionQueueHealthCheck.java
@@ -124,8 +124,9 @@ public class DistributionQueueHealthCheck implements HealthCheck {
                     } else {
                         resultLog.debug("No items in queue [{}]", q.getName());
                     }
-
-                } catch (Exception e) {
+                } catch (IllegalStateException e) {
+                    resultLog.healthCheckError("The job index is not available (just yet) while inspecting replication agent [{}]", queueName);
+                } catch (Exception e) {
                     resultLog.warn("Exception while inspecting distribution queue [{}]: {}", queueName, e);
                 }
             }
{noformat}
* Catching IllegalStateException, as that's what is thrown with SLING-8407 in
the case where no index is available.
* Report this as a health check error: it means the index is not available,
which can happen at the very first startup, or later on if someone removes
the index. In both cases, the system is not in a good state, so reporting an
error is appropriate. I expect nobody monitors the health checks during the
very first startup (where the repository is initialized), but I would argue
that during that time the system is in fact not available.
> DistributionQueueHealthCheck should deal with failing queries
> -------------------------------------------------------------
>
> Key: SLING-8408
> URL: https://issues.apache.org/jira/browse/SLING-8408
> Project: Sling
> Issue Type: Improvement
> Components: Content Distribution
> Reporter: Thomas Mueller
> Priority: Major
>
> The following health check indirectly runs queries which might fail:
> * [DistributionQueueHealthCheck|https://github.com/apache/sling-org-apache-sling-distribution-core/blob/master/src/main/java/org/apache/sling/distribution/monitor/DistributionQueueHealthCheck.java]: sling-org-apache-sling-distribution-core/src/main/java/org/apache/sling/distribution/monitor
> The call to
> [JobManagerImpl.findJobs|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L373]
> can throw an exception with SLING-8407 if the index is not yet
> available. The health check should catch this exception and return
> HEALTH_CHECK_ERROR for this case.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)