[
https://issues.apache.org/jira/browse/LUCENE-9406?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279289#comment-17279289
]
Zach Chen edited comment on LUCENE-9406 at 2/5/21, 2:56 AM:
------------------------------------------------------------
Thanks Michael and Andrzej for the feedback here!
+1 for simplification! When I proposed the interfaces above, I was also
thinking about how the logged metrics would be consumed. Is the plan to
write them out to some default file location through the *IndexWriterMetrics*
interface above, or would something like the following work as well?
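{code:java}
interface IndexWriterMetrics {
  // Event callbacks invoked by IndexWriter around a merge on full flush
  void beginMergeOnFullFlush(OneMerge merge);
  void endMergeOnFullFlush(OneMerge merge);
  ...
  // Exposes the collected metrics so the application can read them
  Map<MetricName, Metric> providesMetrics();
}
{code}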
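To make the second option concrete, here is a minimal sketch of an in-memory implementation that the application could poll. It is only an illustration under several assumptions: {{OneMerge}} below is a tiny stand-in for Lucene's {{MergePolicy.OneMerge}} so the snippet compiles on its own, the return type is narrowed to {{Map<String, Long>}} because {{MetricName}} and {{Metric}} are not defined in this thread, and the class name and metric keys are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Stand-in for Lucene's MergePolicy.OneMerge so the sketch compiles on its own.
final class OneMerge {
  final String name;

  OneMerge(String name) {
    this.name = name;
  }
}

// The proposed interface, with Map<String, Long> assumed in place of
// Map<MetricName, Metric> (those two types are not defined in this thread).
interface IndexWriterMetrics {
  void beginMergeOnFullFlush(OneMerge merge);

  void endMergeOnFullFlush(OneMerge merge);

  Map<String, Long> providesMetrics();
}

// Hypothetical implementation: counts merges and sums their elapsed time.
final class InMemoryIndexWriterMetrics implements IndexWriterMetrics {
  // Start times keyed by merge identity (OneMerge does not override equals).
  private final Map<OneMerge, Long> startNanos = new ConcurrentHashMap<>();
  private final LongAdder started = new LongAdder();
  private final LongAdder finished = new LongAdder();
  private final LongAdder totalNanos = new LongAdder();

  @Override
  public void beginMergeOnFullFlush(OneMerge merge) {
    startNanos.put(merge, System.nanoTime());
    started.increment();
  }

  @Override
  public void endMergeOnFullFlush(OneMerge merge) {
    Long start = startNanos.remove(merge);
    if (start == null) {
      return; // end without a matching begin: ignore it
    }
    totalNanos.add(System.nanoTime() - start);
    finished.increment();
  }

  @Override
  public Map<String, Long> providesMetrics() {
    // Read-only snapshot; the application decides where to publish it.
    Map<String, Long> snapshot = new LinkedHashMap<>();
    snapshot.put("mergeOnFullFlush.started", started.sum());
    snapshot.put("mergeOnFullFlush.finished", finished.sum());
    snapshot.put("mergeOnFullFlush.totalNanos", totalNanos.sum());
    return Collections.unmodifiableMap(snapshot);
  }
}
```

With something like this, nothing is written to a file by default; the application would call {{providesMetrics()}} on its own schedule and forward the snapshot to whatever metrics system it already uses.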
> Make it simpler to track IndexWriter's events
> ---------------------------------------------
>
> Key: LUCENE-9406
> URL: https://issues.apache.org/jira/browse/LUCENE-9406
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/index
> Reporter: Michael McCandless
> Priority: Major
>
> This is the second spinoff from a [controversial PR to add a new index-time
> feature to Lucene to merge small segments during
> commit|https://github.com/apache/lucene-solr/pull/1552]. That change can
> substantially reduce the number of small index segments to search.
> In that PR, there was a new proposed interface, {{IndexWriterEvents}}, giving
> the application a chance to track when {{IndexWriter}} kicked off merges
> during commit, how many, how long it waited, how often it gave up waiting,
> etc.
> Such telemetry from production usage is really helpful when tuning settings
> like which merges (e.g. a size threshold) to attempt on commit, and how long
> to wait during commit, etc.
> I am splitting out this issue to explore possible approaches to do this.
> E.g. [~simonw] proposed using a statistics class instead, but if I understood
> that correctly, I think that would put the role of aggregation inside
> {{IndexWriter}}, which is not ideal.
> Many interesting events, e.g. how many merges are being requested, how large
> they are, how long they took to complete or fail, etc., can be gleaned by
> wrapping expert Lucene classes like {{MergePolicy}} and {{MergeScheduler}}.
> But for those events that cannot (e.g. {{IndexWriter}} stopped waiting for
> merges during commit), it would be very helpful to have some simple way to
> track them so that applications can tune more effectively.
> It is also possible to subclass {{IndexWriter}} and override key methods, but
> I think that is inherently risky as {{IndexWriter}}'s protected methods are
> not considered to be a stable API, and the synchronization used by
> {{IndexWriter}} is confusing.
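
As a rough illustration of the wrapping approach the description mentions, the sketch below times each merge through a delegating wrapper and reports the result to an application-supplied listener, so aggregation stays outside the wrapped class. It does not use Lucene's real signatures: {{MergeRunner}}, {{MergeListener}} and {{TimingMergeRunner}} are hypothetical stand-ins chosen so the example compiles on its own.

```java
// Stand-in for an expert class such as MergeScheduler; this is NOT
// Lucene's real method signature, only a placeholder for the sketch.
interface MergeRunner {
  void runMerge(String segmentName);
}

// Hypothetical callback implemented by the application, which does the
// aggregation itself (counters, histograms, logs, ...).
interface MergeListener {
  void mergeFinished(String segmentName, long elapsedNanos, boolean failed);
}

// Wrapper that delegates the actual work and reports timing and outcome.
final class TimingMergeRunner implements MergeRunner {
  private final MergeRunner delegate;
  private final MergeListener listener;

  TimingMergeRunner(MergeRunner delegate, MergeListener listener) {
    this.delegate = delegate;
    this.listener = listener;
  }

  @Override
  public void runMerge(String segmentName) {
    long start = System.nanoTime();
    boolean failed = true;
    try {
      delegate.runMerge(segmentName);
      failed = false;
    } finally {
      // Reported on both success and failure; any exception still propagates.
      listener.mergeFinished(segmentName, System.nanoTime() - start, failed);
    }
  }
}
```

A wrapper of this kind only sees calls that pass through the wrapped class, which is why events internal to {{IndexWriter}} (such as giving up waiting for merges during commit) would still need a dedicated hook.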