[ https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Nicholas Knize updated LUCENE-6930:
-----------------------------------
    Attachment: LUCENE-6930.patch

Attached patch adds the following changes:

* Adds a new {{GeoPointTokenStream}} that encodes {{GeoPointField}} type into a maximum of 5 "prefix only" terms. Full precision post filtering uses DocValues.
* Adds a {{GeoPointTermQuery.TermEncoding}} enum to choose between existing {{NUMERIC}} encoding or new {{PREFIX}} encoding
* Refactors {{GeoPointTermsEnum}} into a base class with two derived classes:
** {{GeoPointNumericTermsEnum}} - Uses existing {{NumericTokenStream}} logic
** {{GeoPointPrefixTermsEnum}} - Uses new {{GeoPointTokenStream}} logic
* Refactors term encoding logic out of {{GeoUtils}} into a dedicated {{GeoEncodingUtils}} class
* Adds {{randomTermEncoding}} to {{TestGeoPointQuery}} to randomly test both encoding approaches

Quick luceneutil benchmarks: 76% reduction in index size, 81% boost in indexing performance, 19% average boost in query performance.

> Decouple GeoPointField from NumericType
> ---------------------------------------
>
>                 Key: LUCENE-6930
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6930
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Nicholas Knize
>         Attachments: LUCENE-6930.patch
>
>
> {{GeoPointField}} currently relies on {{NumericTokenStream}} to create prefix
> terms for a GeoPoint using the precision step defined in {{GeoPointField}}.
> At search time {{GeoPointTermsEnum}} recurses to a max precision that is
> computed by the Query parameters. This max precision is never the full
> precision, so creating and indexing the full precision terms is useless and
> wasteful (it was always a side effect of just using indexing logic from the
> Numeric type).
> Furthermore, since the numerical logic always stored high precision terms
> first, the recursion in {{GeoPointTermsEnum}} required transient memory for
> storing ranges.
> By moving the trie logic to its own {{GeoPointTokenStream}}
> and reversing the term order (such that lower resolution terms are first),
> the GeoPointTermsEnum can naturally traverse, enabling on-demand creation of
> PrefixTerms. This will be done in a separate issue.
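
For readers unfamiliar with the "prefix only" idea described above, here is a rough sketch of how a point could be turned into a handful of prefix terms, with the lowest-resolution term first and no full-precision term. This is not the code from the attached patch: the class name {{PrefixTermSketch}}, the step of 9 bits, and the bit-interleaving scheme are all assumptions chosen for illustration. The cap of 5 terms is the only number taken from the comment above. The DocValues-based full-precision post filter is not shown.

```java
/**
 * Illustrative sketch only. Every name and constant here is hypothetical
 * and is NOT taken from the LUCENE-6930 patch.
 */
class PrefixTermSketch {
    static final int STEP = 9;       // assumed number of bits added per term
    static final int MAX_TERMS = 5;  // the comment mentions at most 5 prefix terms

    /** One prefix term: the top (64 - shift) bits of the interleaved hash. */
    static final class Term {
        final int shift;
        final long prefix;
        Term(int shift, long prefix) { this.shift = shift; this.prefix = prefix; }
        @Override public String toString() {
            return "shift=" + shift + " prefix=0x" + Long.toHexString(prefix);
        }
    }

    /** Scales v from [min, max] to an unsigned 32-bit integer held in a long. */
    static long scale(double v, double min, double max) {
        return (long) ((v - min) / (max - min) * 0xFFFFFFFFL);
    }

    /** Spreads the low 32 bits of v so they occupy the even bit positions. */
    static long spread(long v) {
        v &= 0xFFFFFFFFL;
        v = (v | (v << 16)) & 0x0000FFFF0000FFFFL;
        v = (v | (v << 8))  & 0x00FF00FF00FF00FFL;
        v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FL;
        v = (v | (v << 2))  & 0x3333333333333333L;
        v = (v | (v << 1))  & 0x5555555555555555L;
        return v;
    }

    /** Interleaves lon (even bits) and lat (odd bits) into one 64-bit hash. */
    static long hash(double lon, double lat) {
        return spread(scale(lon, -180, 180)) | (spread(scale(lat, -90, 90)) << 1);
    }

    /** Emits MAX_TERMS prefixes, coarsest first; shift 0 (full precision) is never reached. */
    static Term[] prefixTerms(double lon, double lat) {
        long h = hash(lon, lat);
        Term[] terms = new Term[MAX_TERMS];
        for (int i = 0; i < MAX_TERMS; i++) {
            int shift = 64 - STEP * (i + 1); // 55, 46, 37, 28, 19 with STEP = 9
            terms[i] = new Term(shift, h >>> shift);
        }
        return terms;
    }

    public static void main(String[] args) {
        for (Term t : prefixTerms(-96.77, 32.76)) {
            System.out.println(t);
        }
    }
}
```

Because each term is a bit prefix of the next, a terms enum walking them in this order could start at the coarse cells and descend only where the query shape requires it, which is the traversal the description refers to.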