[ https://issues.apache.org/jira/browse/LUCENE-8551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16687703#comment-16687703 ]
Christophe Bismuth edited comment on LUCENE-8551 at 11/15/18 2:40 PM:
----------------------------------------------------------------------

Sounds challenging, I'd like to work on it!


was (Author: cbismuth):
Sounds challenging, I'd like to work in it!

> Purge unused FieldInfo on segment merge
> ---------------------------------------
>
>                 Key: LUCENE-8551
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8551
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/index
>            Reporter: David Smiley
>            Priority: Major
>
> If a field is effectively unused (no norms, terms index, term vectors,
> docValues, stored value, points index), it will nonetheless hang around in
> FieldInfos indefinitely. It would be nice to be able to recognize an unused
> FieldInfo and allow it to disappear after a merge (or two).
>
> SegmentMerger merges FieldInfo (from each segment) as nearly the first thing
> it does. After that, it merges the different index parts, before it's known
> what's "used" or not. After writing, we theoretically know which fields are
> used or not, though we're not doing any bookkeeping to track it. Maybe we
> should track the fields used during writing so we write a filtered merged
> FieldInfo at the end instead of an unfiltered one up front? Or perhaps upon
> reading a segment, we make it cheap/easy for each index type (e.g. terms
> index, stored fields, ...) to know which fields have data for the
> corresponding type. Then, on a subsequent merge, we know up front to filter
> the FieldInfos.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
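
To make the first option in the issue description concrete (track which fields are used while writing, then write a filtered field list at the end), here is a minimal sketch. It is not Lucene code: `UsedFieldTracker`, `markUsed`, and `filter` are hypothetical names invented for illustration, and fields are reduced to plain strings rather than real FieldInfo objects. It only assumes that each index-part writer could report the fields it actually wrote data for.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical merge-time bookkeeping; a toy model, not Lucene's real API. */
final class UsedFieldTracker {
    // Names of fields for which at least one index part wrote data.
    private final Set<String> used = new HashSet<>();

    /** Assumed to be called by any index-part writer when it writes data for a field. */
    void markUsed(String fieldName) {
        used.add(fieldName);
    }

    /** Returns the merged field names, in their original order, minus those no writer touched. */
    List<String> filter(List<String> mergedFieldNames) {
        List<String> kept = new ArrayList<>();
        for (String name : mergedFieldNames) {
            if (used.contains(name)) {
                kept.add(name);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        UsedFieldTracker tracker = new UsedFieldTracker();
        tracker.markUsed("title"); // e.g. terms were written for this field
        tracker.markUsed("price"); // e.g. doc values were written for this field

        // "legacy_flag" appears in the merged field list but no writer reported data for it.
        List<String> merged = Arrays.asList("title", "legacy_flag", "price");
        System.out.println(tracker.filter(merged)); // prints [title, price]
    }
}
```

The second option in the description (each index type cheaply reporting which fields have data when a segment is read) would populate the same kind of set at read time instead, so the filtering could happen before the merge rather than after writing.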