[
https://issues.apache.org/jira/browse/LUCENE-1761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13959450#comment-13959450
]
Robert Muir commented on LUCENE-1761:
-------------------------------------
{quote}
and honestly shouldn’t be that hard to fix, right? If dead fields are culled as
the segments are merged, this would just fix itself naturally wouldn’t it?
{quote}
It's pretty tricky to fix, actually. There is a lot going on here, including
concurrency concerns with field numbers. Recycling field numbers would be even
more difficult. The price of a mistake here is index corruption, because bulk
stored fields merging (among other things) relies on field numbers being
consistent.
The risk is high, and the use-case for this is... not clear (in your case as
you describe, it was an app bug). In such a situation I think filtering them
out with something like addIndexes+FieldFilterAtomicReader is an acceptable
workaround.
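
To illustrate the idea behind that workaround (wrap the old index in a reader
that hides the dead fields, then add it to a fresh index), here is a rough,
self-contained sketch. It is a toy model only: ToyReader, ToyIndex and
FieldFilterReader are hypothetical stand-ins written for this example and are
not Lucene classes, and the real addIndexes/FieldFilterAtomicReader signatures
should be checked against the javadocs for your version.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: none of these types are Lucene classes.
class FieldCullSketch {

    /** Read-only view: field metadata plus live documents. */
    interface ToyReader {
        Set<String> fieldNames();
        List<Map<String, String>> docs();
    }

    /** An index whose field metadata only ever grows, as described in the issue. */
    static final class ToyIndex implements ToyReader {
        private final Set<String> fieldNames = new TreeSet<>();
        private final List<Map<String, String>> docs = new ArrayList<>();

        void addDocument(Map<String, String> doc) {
            fieldNames.addAll(doc.keySet());
            docs.add(new LinkedHashMap<>(doc));
        }

        /** Deletes every document containing the field; metadata is left untouched. */
        void deleteDocsWithField(String field) {
            docs.removeIf(d -> d.containsKey(field));
        }

        /** Copies metadata and documents from a reader, trusting whatever it exposes. */
        void addIndexes(ToyReader reader) {
            fieldNames.addAll(reader.fieldNames());
            for (Map<String, String> d : reader.docs()) {
                docs.add(new LinkedHashMap<>(d));
            }
        }

        public Set<String> fieldNames() { return new TreeSet<>(fieldNames); }
        public List<Map<String, String>> docs() { return new ArrayList<>(docs); }
    }

    /** Wraps a reader and hides the given fields from both metadata and documents. */
    static final class FieldFilterReader implements ToyReader {
        private final ToyReader in;
        private final Set<String> hidden;

        FieldFilterReader(ToyReader in, Set<String> hidden) {
            this.in = in;
            this.hidden = hidden;
        }

        public Set<String> fieldNames() {
            Set<String> names = new TreeSet<>(in.fieldNames());
            names.removeAll(hidden);
            return names;
        }

        public List<Map<String, String>> docs() {
            List<Map<String, String>> out = new ArrayList<>();
            for (Map<String, String> d : in.docs()) {
                Map<String, String> copy = new LinkedHashMap<>(d);
                copy.keySet().removeAll(hidden);
                out.add(copy);
            }
            return out;
        }
    }

    public static void main(String[] args) {
        ToyIndex old = new ToyIndex();
        old.addDocument(Map.of("id", "1", "title", "a"));
        old.addDocument(Map.of("id", "2", "typo_field", "x")); // the "app bug" field
        old.deleteDocsWithField("typo_field");

        // The dead field is still listed even though no live document uses it.
        System.out.println("old index fields:   " + old.fieldNames());

        // Rebuild through a filtering view so the dead field is never copied.
        ToyIndex clean = new ToyIndex();
        clean.addIndexes(new FieldFilterReader(old, Set.of("typo_field")));
        System.out.println("clean index fields: " + clean.fieldNames());
    }
}
```

The point of the sketch is only that the destination trusts whatever field
metadata the source reader exposes, so the filtering has to happen on the
reader side before the copy.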
As for why the opening is slow, that's specific to Lucene 3.x's updatable
norms (separate norms), which I'd bet $20 you aren't even using. Unfortunately,
the same situation presents itself in SegmentReader due to updatable docvalues.
I committed a comment that will hopefully be addressed:
{code}
// TODO: can we avoid iterating over fieldinfos several times and creating
// maps of all this stuff if dv updates do not exist?
{code}
> low level Field metadata is never removed from index
> ----------------------------------------------------
>
> Key: LUCENE-1761
> URL: https://issues.apache.org/jira/browse/LUCENE-1761
> Project: Lucene - Core
> Issue Type: Bug
> Components: core/index
> Affects Versions: 2.2, 2.3, 2.3.1, 2.3.2, 2.4, 2.4.1
> Reporter: Hoss Man
> Priority: Minor
> Labels: gsoc2014
> Attachments: LUCENE-1761.patch
>
>
> With heterogeneous docs, or an index whose fields evolve over time, field
> names that are no longer used (i.e. all docs that ever referenced them have
> been deleted) still show up when you use IndexReader.getFieldNames.
> It seems logical that segment merging should only preserve metadata about
> fields that actually exist in the new segment, but even after deleting all
> documents from an index and optimizing, the old field names are still present.