[ https://issues.apache.org/jira/browse/OAK-7504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16483885#comment-16483885 ]
Michael Dürig commented on OAK-7504:
------------------------------------
+1, I think this is valuable. We need to understand the overhead though: super
roots are updated (re-written) very frequently so adding a considerable amount
of data might not be viable. As long as we have a way to control the amount of
extra data (e.g. via a configurable sampling interval as proposed in the issue
description) I think we are fine as we can experiment with different settings
and quantify the overhead.
> Include dynamic commit information in the persisted repository data
> -------------------------------------------------------------------
>
> Key: OAK-7504
> URL: https://issues.apache.org/jira/browse/OAK-7504
> Project: Jackrabbit Oak
> Issue Type: Improvement
> Components: segment-tar
> Reporter: Francesco Mari
> Priority: Minor
> Fix For: 1.10
>
>
> The data in the Segment Store doesn't provide any information about the
> dynamic behaviour of the system. For example, who performed the commit? How
> many commits were performed by the same committer?
> In order to simplify debugging the dynamic behaviour of a system, it should
> be possible to store metadata about the commit in the super-root generated by
> that commit. For example, the following information might be attached to the
> super-root:
> * The name of the thread performing the commit. This solution might prove
> expensive in terms of consumed disk space, but would be the most precise tool
> to identify the author of a commit.
> * A hash of the thread name. If storing thread names proves expensive, a hash
> of the thread name can be stored instead. This doesn't allow us to identify
> the author of the commit exactly, but would allow us to correlate different
> commits as performed by the same thread.
> * Both the thread name and its hash, with the thread name stored only every
> Nth commit. This solution is not as precise as storing the thread name for
> every commit but, if there is a frequent committer, its thread name will be
> more likely to be sampled, thus providing a precise identity to a thread name
> hash.
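
The third option above (hash on every commit, full thread name only every
Nth commit) can be sketched roughly as follows. This is only an illustration
under stated assumptions: CommitMetadataSampler, CommitMetadata and the
samplingInterval parameter are hypothetical names, not part of Oak, and
String.hashCode() stands in for whatever hash function would actually be
chosen.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not Oak API: decides what commit metadata to attach.
final class CommitMetadataSampler {

    // Illustrative container for the data that might go into a super-root.
    static final class CommitMetadata {
        final int threadNameHash;   // stored for every commit
        final String threadName;    // null unless this commit was sampled

        CommitMetadata(int threadNameHash, String threadName) {
            this.threadNameHash = threadNameHash;
            this.threadName = threadName;
        }
    }

    private final long samplingInterval;              // the "N" in "every Nth commit"
    private final AtomicLong commits = new AtomicLong();

    CommitMetadataSampler(long samplingInterval) {
        if (samplingInterval < 1) {
            throw new IllegalArgumentException("samplingInterval must be >= 1");
        }
        this.samplingInterval = samplingInterval;
    }

    // Returns the metadata for the next commit performed by the given thread.
    CommitMetadata next(String threadName) {
        long n = commits.getAndIncrement();
        boolean sampled = n % samplingInterval == 0;  // commits 0, N, 2N, ...
        // Assumption: String.hashCode() is used purely for illustration.
        return new CommitMetadata(threadName.hashCode(), sampled ? threadName : null);
    }

    public static void main(String[] args) {
        CommitMetadataSampler sampler = new CommitMetadataSampler(3);
        for (int i = 0; i < 6; i++) {
            CommitMetadata m = sampler.next(Thread.currentThread().getName());
            System.out.println(Integer.toHexString(m.threadNameHash) + " " + m.threadName);
        }
    }
}
```

With a sampling interval of 1 this degenerates to the first option (name on
every commit). A larger interval trades precision for less data written per
super-root, which is the knob that would let the overhead be measured with
different settings.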