[
https://issues.apache.org/jira/browse/LUCENE-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Martijn van Groningen updated LUCENE-6496:
------------------------------------------
Attachment: LUCENE-6496.patch
After I chatted with Robert, I removed the ImmutableOrdinalMap impl and just
let MultiDocValues.OrdinalMap implement the OrdinalMap interface.
Also I moved the UpdatableOrdinalMap to the sandbox module, so the updatable
impl can be ironed out there. For example, the updatable ordinal logic might
also be implemented as a DirectoryReader impl.
> Updatable OrdinalMap
> ---------------------
>
> Key: LUCENE-6496
> URL: https://issues.apache.org/jira/browse/LUCENE-6496
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Martijn van Groningen
> Priority: Minor
> Attachments: LUCENE-6496.patch, LUCENE-6496.patch
>
>
> The MultiDocValues.OrdinalMap that we have today requires a rebuild on
> each reopen. Once the OrdinalMap has been built, lookups are fast and the
> logic is simple. Many times rebuilding the OrdinalMap isn't even an issue,
> because for low to medium cardinality fields the rebuilding doesn't take that
> much time. The time required to build the OrdinalMap depends on the number of
> unique terms in a field.
> For high cardinality fields (let's say >= 1M terms) rebuilding the OrdinalMap
> can take some time to complete. This can then impact the NRT aspect of many
> applications (facets may rely on ordinal maps to be rebuilt before a new
> search can happen after the reopen).
> I'd like to explore a different OrdinalMap implementation that doesn't need to
> be rebuilt on each reopen. There are simple improvements that can be made:
> * Let's say docs have only been marked as deleted, then we can basically reuse
> the OrdinalMap that has already been built.
> * If no new terms have been introduced we can just add the new segments'
> segment ordinal to global ordinal lookups to the OrdinalMap that has already
> been built.
> I think a complete OrdinalMap rebuild is inevitable, but it would be great if
> we could rebuild on a flush / merge instead of on each reopen.
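A minimal sketch of the second improvement described above (no new terms, so
only a segment ordinal to global ordinal lookup needs to be added). This is not
the attached patch and not Lucene's MultiDocValues.OrdinalMap: the class
AppendOnlyOrdinalMapSketch and its methods are hypothetical, terms are plain
Java Strings in natural order rather than BytesRef, and the global term list is
held in a simple sorted array. It only illustrates the control flow: try to
append the new segment, and fall back to a full rebuild when an unknown term
shows up. The deletes-only case needs no code at all under these assumptions,
since the existing map is reused unchanged.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; names and structure are assumptions. */
final class AppendOnlyOrdinalMapSketch {
    // Sorted unique terms across all segments; the array index is the global ordinal.
    private final String[] globalTerms;
    // One int[] per segment: segment ordinal -> global ordinal.
    private final List<int[]> segmentToGlobal = new ArrayList<>();

    AppendOnlyOrdinalMapSketch(String[] sortedGlobalTerms) {
        this.globalTerms = sortedGlobalTerms.clone();
    }

    /**
     * Tries to register a new segment without rebuilding the map.
     * Returns false if the segment contains a term that has no global
     * ordinal yet, in which case the caller has to do a full rebuild.
     */
    boolean tryAddSegment(String[] sortedSegmentTerms) {
        int[] mapping = new int[sortedSegmentTerms.length];
        for (int segOrd = 0; segOrd < sortedSegmentTerms.length; segOrd++) {
            int globalOrd = Arrays.binarySearch(globalTerms, sortedSegmentTerms[segOrd]);
            if (globalOrd < 0) {
                return false; // new term: the existing map cannot be extended
            }
            mapping[segOrd] = globalOrd;
        }
        segmentToGlobal.add(mapping);
        return true;
    }

    /** Looks up the global ordinal for a segment-local ordinal. */
    int getGlobalOrd(int segmentIndex, int segmentOrd) {
        return segmentToGlobal.get(segmentIndex)[segmentOrd];
    }

    int segmentCount() {
        return segmentToGlobal.size();
    }
}
```

With this shape, the cost of a reopen that introduces no new terms is
proportional to the number of terms in the new segment, not to the number of
unique terms in the field.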