Michael McCandless created LUCENE-5966:
------------------------------------------
Summary: How to migrate from numeric fields to auto-prefix terms
Key: LUCENE-5966
URL: https://issues.apache.org/jira/browse/LUCENE-5966
Project: Lucene - Core
Issue Type: Improvement
Reporter: Michael McCandless
In LUCENE-5879 we are adding auto-prefix terms to the default terms dict. The
feature generalizes the approach used by numeric fields and offers faster
performance while using less index space and about the same indexing time.
But there are many users out there with indices already created containing
numeric fields ... so ideally we have some simple way for such users to switch
over to auto-prefix terms.
Robert has a good plan (copied from LUCENE-5879):
Here are some thoughts.
# Keep the current trie "Encoding" for terms; it just uses precision step=Inf
and lets the term dictionary do it automatically.
# Create a FilterAtomicReader that, for a previously trie-encoded field,
removes "fake" terms on merge.
Users could continue to use NumericRangeQuery, just with the infinite precision
step. It will always work, but it will execute slower on old segments because
it doesn't take advantage of the trie terms that are not yet merged away.
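
To make the "fake" terms concrete, here is a minimal, library-free sketch of
the idea. It is a simplified model, not Lucene's actual encoding (the real
trie encoding also transforms values into sortable bits and stores the shift
in the term bytes). The names TrieTermsSketch, Term, termsFor and
dropFakeTerms are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of trie terms for a 64-bit value (assumption: not Lucene's real encoding).
final class TrieTermsSketch {

    /** One term: the value with its lowest `shift` bits dropped. */
    static final class Term {
        final int shift;
        final long prefix;

        Term(int shift, long prefix) {
            this.shift = shift;
            this.prefix = prefix;
        }

        /** Only the shift == 0 term carries the full value; the rest are the "fake" terms. */
        boolean isFullPrecision() {
            return shift == 0;
        }
    }

    /** Terms produced for one value at the given precision step. */
    static List<Term> termsFor(long value, int precisionStep) {
        if (precisionStep < 1) {
            throw new IllegalArgumentException("precisionStep must be >= 1");
        }
        List<Term> terms = new ArrayList<>();
        // The loop counter is a long so that a huge step (the "Inf" case) cannot overflow.
        for (long shift = 0; shift < 64; shift += precisionStep) {
            terms.add(new Term((int) shift, value >>> (int) shift));
        }
        return terms;
    }

    /** What a merge-time filter would keep: the full-precision terms only. */
    static List<Term> dropFakeTerms(List<Term> terms) {
        List<Term> kept = new ArrayList<>();
        for (Term t : terms) {
            if (t.isFullPrecision()) {
                kept.add(t);
            }
        }
        return kept;
    }
}
```

In this model a step of 16 yields four terms per value (shifts 0, 16, 32, 48),
while Integer.MAX_VALUE, standing in for Inf, yields only the full-precision
term. That full-precision term is present in both old and new segments, which
is why a query using the infinite step still matches everywhere.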
One obstacle to making this really nice is that Lucene doesn't know for sure
that a field is numeric, so it cannot be "full-auto". Apps would have to use
their schema or whatever to wrap with this reader in their merge policy.
Maybe we could provide some sugar for this, such as a wrapping merge policy
that takes a list of field names that are numeric, or sugar to pass this to IWC
in IndexUpgrader to force it, and so on.
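
As a rough illustration of the "list of field names" idea, the sketch below
shows only the per-field decision such sugar would need to make. It does not
use any Lucene classes. NumericFieldSelectorSketch and keepOnMerge are
hypothetical names, and the convention that shift == 0 marks a full-precision
term is an assumption of this sketch.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: the app supplies the numeric field names from its own schema.
final class NumericFieldSelectorSketch {
    private final Set<String> numericFields;

    NumericFieldSelectorSketch(String... numericFields) {
        this.numericFields =
            Collections.unmodifiableSet(new HashSet<>(Arrays.asList(numericFields)));
    }

    /**
     * Decides whether a term survives the merge.
     * Terms of fields not listed as numeric are never touched; for listed
     * fields only the full-precision term (shift == 0 in this sketch) is kept.
     */
    boolean keepOnMerge(String field, int shift) {
        if (!numericFields.contains(field)) {
            return true;
        }
        return shift == 0;
    }
}
```

A wrapping merge policy or an IndexUpgrader option could consult something
like this for each field, leaving all non-numeric fields untouched.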