[ 
https://issues.apache.org/jira/browse/LUCENE-6496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Martijn van Groningen updated LUCENE-6496:
------------------------------------------
    Attachment: LUCENE-6496.patch

After I chatted with Robert, I removed the ImmutableOrdinalMap impl and just 
let MultiDocValues.OrdinalMap implement the OrdinalMap interface. 

Also I moved the UpdatableOrdinalMap to the sandbox module, so the updatable
impl can be ironed out there. For example, the updatable ordinal logic could
also be implemented as a DirectoryReader impl.

> Updatable OrdinalMap 
> ---------------------
>
>                 Key: LUCENE-6496
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6496
>             Project: Lucene - Core
>          Issue Type: Bug
>            Reporter: Martijn van Groningen
>            Priority: Minor
>         Attachments: LUCENE-6496.patch, LUCENE-6496.patch
>
>
> The MultiDocValues.OrdinalMap that we have today requires a rebuild on
> each reopen. Once the OrdinalMap has been built, lookups are fast and the
> logic is simple. Many times rebuilding the OrdinalMap isn't even an issue,
> because for low to medium cardinality fields the rebuilding doesn't take that
> much time. The time required to build the OrdinalMap depends on the number of
> unique terms in a field.
> For high cardinality fields (let's say >= 1M terms) rebuilding the OrdinalMap
> can take some time to complete. This can then impact the NRT aspect of many
> applications (facets may rely on ordinal maps being rebuilt before a new
> search can happen after the reopen).
> I'd like to explore a different OrdinalMap implementation that doesn't need to
> be rebuilt on each reopen. There are simple improvements that can be made:
> * If docs have only been marked as deleted, then we can basically reuse the
> OrdinalMap that has already been built.
> * If no new terms have been introduced, we can just add segment-ordinal to
> global-ordinal lookups to the OrdinalMap that has already been built.
> I think a complete OrdinalMap rebuild is inevitable, but it would be great if 
> we could rebuild on a flush / merge instead of on each reopen.
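
The two improvements listed in the description can be illustrated with a toy
model. This is only a sketch under stated assumptions: ToyOrdinalMap, its
fields and its methods are hypothetical names made up for illustration, it is
not the attached patch, and it does not mirror how MultiDocValues.OrdinalMap
is actually implemented. Each segment is modelled as a sorted array of unique
terms. A segment that introduces no new terms only gets its own lookup array
added, and a deletes-only reopen would not touch the map at all.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

/** Toy model of a segment-ordinal to global-ordinal map (hypothetical, not Lucene's classes). */
final class ToyOrdinalMap {
    // Sorted, unique terms across all segments; the index is the global ordinal.
    private String[] globalTerms = new String[0];
    // Each segment's sorted, unique terms; the index is the segment ordinal.
    private final List<String[]> segments = new ArrayList<>();
    // Per segment: segment ordinal -> global ordinal.
    private final List<int[]> segToGlobal = new ArrayList<>();
    // Counts how often the whole map had to be rebuilt.
    int fullRebuilds = 0;

    /** Adds a segment; returns true if the existing map was extended without a rebuild. */
    boolean addSegment(String[] sortedTerms) {
        segments.add(sortedTerms);
        int[] lookup = new int[sortedTerms.length];
        for (int ord = 0; ord < sortedTerms.length; ord++) {
            int global = Arrays.binarySearch(globalTerms, sortedTerms[ord]);
            if (global < 0) {
                // An unseen term shifts existing global ordinals, so rebuild everything.
                rebuild();
                return false;
            }
            lookup[ord] = global;
        }
        // No new terms: only this segment's lookup is added.
        segToGlobal.add(lookup);
        return true;
    }

    /** Recomputes the global term list and every per-segment lookup. */
    private void rebuild() {
        fullRebuilds++;
        TreeSet<String> all = new TreeSet<>();
        for (String[] seg : segments) {
            all.addAll(Arrays.asList(seg));
        }
        globalTerms = all.toArray(new String[0]);
        segToGlobal.clear();
        for (String[] seg : segments) {
            int[] lookup = new int[seg.length];
            for (int ord = 0; ord < seg.length; ord++) {
                lookup[ord] = Arrays.binarySearch(globalTerms, seg[ord]);
            }
            segToGlobal.add(lookup);
        }
    }

    /** Translates a segment ordinal into its global ordinal. */
    int globalOrd(int segment, int segmentOrd) {
        return segToGlobal.get(segment)[segmentOrd];
    }
}
```

In this model the cost of adding a segment without new terms depends on that
segment's term count rather than on the total number of unique terms, which is
the saving the description is after. A segment with even one unseen term still
triggers a full rebuild here.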


