[ https://issues.apache.org/jira/browse/SLING-12262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17824365#comment-17824365 ]
Joerg Hoh commented on SLING-12262:
-----------------------------------
We scrape metrics with Prometheus and use an Alertmanager instance to create
alerts from them. When an instance fails to start, it is much easier to tell
whether repoinit is the culprit by querying a metric than by searching the
logs for the characteristic repoinit exception. A metric lets us refine
the "instance-not-starting-up" alert into an
"instance-not-starting-up-because-of-repoinit-issues" alert, which is much more
meaningful and can be handled differently from the generic alert, which
always requires the general triage process.
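To illustrate the idea, here is a minimal sketch of counting repoinit failures.
It is not how repoinit is implemented: RepoinitFailureMetric, Statement,
applyAll and failureCount are hypothetical names, and the AtomicLong only
stands in for a counter that a real metrics registry would expose to the
scraper.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: count failed repoinit runs so a scraper can alert on them. */
public class RepoinitFailureMetric {

    // Assumption: stand-in for a counter obtained from a real metrics registry.
    private final AtomicLong failures = new AtomicLong();

    /** Hypothetical abstraction of "apply one repoinit statement". */
    public interface Statement {
        void apply() throws Exception;
    }

    /**
     * Applies the statements in order. On the first failure the counter is
     * incremented and the exception is rethrown, so startup still fails as
     * it does today; the only change is that the failure becomes queryable.
     */
    public void applyAll(List<Statement> statements) throws Exception {
        for (Statement statement : statements) {
            try {
                statement.apply();
            } catch (Exception e) {
                failures.incrementAndGet();
                throw e;
            }
        }
    }

    /** Current number of failed repoinit runs recorded by this instance. */
    public long failureCount() {
        return failures.get();
    }
}
```

With something like this in place, the refined alert could fire whenever the
instance is not ready and the failure counter is greater than zero, while the
generic alert covers every other case.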
> Repoinit: report failures via metrics
> -------------------------------------
>
> Key: SLING-12262
> URL: https://issues.apache.org/jira/browse/SLING-12262
> Project: Sling
> Issue Type: Task
> Components: Repoinit
> Affects Versions: Repoinit JCR 1.1.46
> Reporter: Joerg Hoh
> Priority: Major
>
> When a repoinit statement fails (and for that reason the SlingRepository
> service cannot be started), repoinit should expose this as a metric.