[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-02-04 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Fix Version/s: trunk, 5.5

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: Nicholas Knize
> Fix For: 5.5, trunk
>
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch, 
> LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch
>
>
> {{GeoPointField}} currently relies on {{NumericTokenStream}} to create prefix 
> terms for a GeoPoint using the precision step defined in {{GeoPointField}}. 
> At search time {{GeoPointTermsEnum}} recurses to a max precision that is 
> computed from the query parameters. This max precision is never the full 
> precision, so creating and indexing the full-precision terms is wasteful 
> (it was always a side effect of reusing the indexing logic from the 
> Numeric type). 
> Furthermore, since the numeric logic always stored high-precision terms 
> first, the recursion in {{GeoPointTermsEnum}} required transient memory for 
> storing ranges. By moving the trie logic to its own {{GeoPointTokenStream}} 
> and reversing the term order (such that lower-resolution terms come first), 
> {{GeoPointTermsEnum}} can traverse the terms naturally, enabling on-demand 
> creation of PrefixTerms. This will be done in a separate issue.
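
To make the "lower resolution terms first, never full precision" idea in the description concrete, here is a minimal sketch in plain Java. It is not the code from the patch: the class name, the {{precisionStep}} of 9 and the cap of 5 terms used in {{main}} are assumptions chosen for illustration, and a term is modelled simply as the encoded long with its low bits cleared.

```java
import java.util.ArrayList;
import java.util.List;

final class PrefixTermSketch {
    /** One prefix term: the encoded value with its low bits cleared, plus the shift used. */
    static final class PrefixTerm {
        final long prefix;
        final int shift;

        PrefixTerm(long prefix, int shift) {
            this.prefix = prefix;
            this.shift = shift;
        }

        @Override
        public String toString() {
            return "shift=" + shift + " prefix=" + Long.toHexString(prefix);
        }
    }

    /**
     * Emits at most maxTerms prefix terms, coarsest (largest shift) first.
     * precisionStep and maxTerms are illustrative parameters, not Lucene constants.
     */
    static List<PrefixTerm> prefixTerms(long encoded, int precisionStep, int maxTerms) {
        List<PrefixTerm> terms = new ArrayList<>();
        for (int i = 1; i <= maxTerms; i++) {
            int shift = 64 - i * precisionStep;
            if (shift <= 0) {
                break; // never emit the full-precision (shift 0) term
            }
            long prefix = encoded & ~((1L << shift) - 1); // clear the low `shift` bits
            terms.add(new PrefixTerm(prefix, shift));
        }
        return terms;
    }

    public static void main(String[] args) {
        // Hypothetical encoded point; prints one line per term, coarsest first.
        for (PrefixTerm t : prefixTerms(0x123456789ABCDEF0L, 9, 5)) {
            System.out.println(t);
        }
    }
}
```

Because the coarsest term comes first, a consumer can walk from low to high resolution and stop as soon as the query's max precision is reached, which is the traversal order the description argues for.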



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-02-01 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Component/s: modules/spatial

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch, 
> LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-28 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Attachment: LUCENE-6930.patch

Updated patch to include review feedback.

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch, 
> LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-27 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Attachment: LUCENE-6930.patch

Corrected patch

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch, 
> LUCENE-6930.patch, LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-25 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Attachment: LUCENE-6930.patch

Updated patch to include the following:

* Incorporates review feedback.
* Overrides {{GeoPointPrefixTermsEnum.accept}} to "seek" to the "floor" range 
of the candidate term. This boosts query performance by eliminating superfluous 
range visits.
* Fixes a bug in {{GeoEncodingUtils.geoCodedToPrefixCodedBytes}} and 
{{.getPrefixCodedLongShift}}, which ignored the {{BytesRef.offset}} field.
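
The "floor" seek in the second bullet can be pictured with a small plain-Java sketch. This is not the {{GeoPointPrefixTermsEnum.accept}} implementation: it assumes the query ranges are sorted, non-overlapping, and already available as long start/end arrays, and every name in it is hypothetical.

```java
import java.util.Arrays;

final class FloorRangeSeekSketch {
    /** Index of the last range whose start is <= candidate, or -1 if there is none. */
    static int floorIndex(long[] sortedStarts, long candidate) {
        int idx = Arrays.binarySearch(sortedStarts, candidate);
        // When the key is absent, binarySearch returns -(insertionPoint) - 1,
        // so the floor entry sits at insertionPoint - 1 == -idx - 2.
        return idx >= 0 ? idx : -idx - 2;
    }

    /** Accepts the candidate only if its floor range actually contains it. */
    static boolean accept(long[] sortedStarts, long[] ends, long candidate) {
        int i = floorIndex(sortedStarts, candidate);
        return i >= 0 && candidate <= ends[i];
    }

    public static void main(String[] args) {
        long[] starts = {10, 50, 100};
        long[] ends = {20, 60, 120};
        System.out.println(accept(starts, ends, 55)); // inside [50, 60]
        System.out.println(accept(starts, ends, 30)); // falls in the gap after [10, 20]
    }
}
```

Jumping straight to the floor range means ranges that end before the candidate term are never visited one by one, which is the saving the bullet describes.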

I'm going to open a separate query performance issue that switches from 
comparing BytesRefs to directly comparing the long-encoded range values. 
Candidate terms will be converted to their encoded range values, eliminating 
the need to constantly convert ranges to BytesRefs for comparisons.
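
As a rough illustration of both the offset fix and the planned long comparison, the sketch below decodes a term once into a long and compares it against range bounds. It rests on assumptions: {{Slice}} is a hypothetical stand-in that only mirrors the {{bytes}}/{{offset}}/{{length}} fields of {{BytesRef}}, and the plain 8-byte big-endian layout is chosen for simplicity rather than taken from Lucene's prefix-coded format.

```java
final class EncodedRangeCompareSketch {
    /** Hypothetical byte slice with the same three fields BytesRef exposes. */
    static final class Slice {
        final byte[] bytes;
        final int offset;
        final int length;

        Slice(byte[] bytes, int offset, int length) {
            this.bytes = bytes;
            this.offset = offset;
            this.length = length;
        }
    }

    /** Decodes 8 big-endian bytes starting at slice.offset, not at index 0. */
    static long toLong(Slice s) {
        if (s.length < 8) {
            throw new IllegalArgumentException("need at least 8 bytes, got " + s.length);
        }
        long v = 0;
        for (int i = 0; i < 8; i++) {
            v = (v << 8) | (s.bytes[s.offset + i] & 0xFFL);
        }
        return v;
    }

    /** Compares the decoded term against long bounds; no per-range byte conversion. */
    static boolean inRange(Slice term, long start, long end) {
        long v = toLong(term);
        return Long.compareUnsigned(v, start) >= 0 && Long.compareUnsigned(v, end) <= 0;
    }

    public static void main(String[] args) {
        // Two leading bytes belong to some other term sharing the same buffer.
        byte[] shared = {99, 99, 0, 0, 0, 0, 0, 0, 0, 42};
        Slice term = new Slice(shared, 2, 8);
        System.out.println(toLong(term));
        System.out.println(inRange(term, 40L, 50L));
    }
}
```

A decoder that indexed from 0 instead of {{offset}} would read the two foreign bytes in the shared buffer, which is the class of bug described in the last bullet above.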

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch, 
> LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-21 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Attachment: LUCENE-6930.patch

Thanks for the great feedback, [~mikemccand]. Patch updated with your 
recommendations.

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch, LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-18 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Attachment: LUCENE-6930.patch

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch, LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2016-01-18 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Attachment: LUCENE-6930.patch

Attached patch adds the following changes:

* Adds a new {{GeoPointTokenStream}} that encodes a {{GeoPointField}} value 
into a maximum of 5 "prefix only" terms. Full-precision post-filtering uses 
DocValues.
* Adds a {{GeoPointTermQuery.TermEncoding}} enum to choose between the existing 
{{NUMERIC}} encoding and the new {{PREFIX}} encoding.
* Refactors {{GeoPointTermsEnum}} into a base class with two derived classes:
** {{GeoPointNumericTermsEnum}} - Uses existing {{NumericTokenStream}} logic
** {{GeoPointPrefixTermsEnum}} - Uses new {{GeoPointTokenStream}} logic
* Refactors term encoding logic out of {{GeoUtils}} into a dedicated 
{{GeoEncodingUtils}} class.
* Adds {{randomTermEncoding}} to {{TestGeoPointQuery}} to randomly test both 
encoding approaches.
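
The enum-based switch and the randomized test selection listed above could look roughly like the following plain-Java sketch. It only mirrors the names mentioned in this comment: the wrapper class is hypothetical, and the terms-enum classes are represented by their names as strings rather than by real Lucene types.

```java
import java.util.Random;

final class TermEncodingSketch {
    /** Mirrors the two encodings named in the patch notes. */
    enum TermEncoding { NUMERIC, PREFIX }

    /** Picks the terms-enum flavour for an encoding (class names used as stand-ins). */
    static String termsEnumFor(TermEncoding encoding) {
        switch (encoding) {
            case NUMERIC:
                return "GeoPointNumericTermsEnum";
            case PREFIX:
                return "GeoPointPrefixTermsEnum";
            default:
                throw new IllegalArgumentException("unknown encoding: " + encoding);
        }
    }

    /** Chooses an encoding at random so a test run exercises both paths over time. */
    static TermEncoding randomTermEncoding(Random random) {
        TermEncoding[] values = TermEncoding.values();
        return values[random.nextInt(values.length)];
    }

    public static void main(String[] args) {
        TermEncoding encoding = randomTermEncoding(new Random());
        System.out.println(encoding + " -> " + termsEnumFor(encoding));
    }
}
```

Keeping the choice in one enum lets existing indexes stay on the {{NUMERIC}} path while new ones opt in to {{PREFIX}}.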

Quick luceneutil benchmarks: 76% reduction in index size, 81% boost in indexing 
performance, 19% average boost in query performance.

> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize
> Attachments: LUCENE-6930.patch



[jira] [Updated] (LUCENE-6930) Decouple GeoPointField from NumericType

2015-12-14 Thread Nicholas Knize (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-6930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nicholas Knize updated LUCENE-6930:
---
Description: 
{{GeoPointField}} currently relies on {{NumericTokenStream}} to create prefix 
terms for a GeoPoint using the precision step defined in {{GeoPointField}}. At 
search time {{GeoPointTermsEnum}} recurses to a max precision that is computed 
by the Query parameters. This max precision is never the full precision, so 
creating and indexing the full precision terms is useless and wasteful (it was 
always a side effect of just using indexing logic from the Numeric type). 

Furthermore, since the numerical logic always stored high precision terms 
first, the recursion in {{GeoPointTermsEnum}} required transient memory for 
storing ranges. By moving the trie logic to its own {{GeoPointTokenStream}} and 
reversing the term order (such that lower resolution terms are first), the 
GeoPointTermsEnum can naturally traverse, enabling on-demand creation of 
PrefixTerms. This will be done in a separate issue.

  was:
{{GeoPointField}} currently relies on {{NumericTokenStream}} to create prefix 
terms for a GeoPoint using the precision step defined in {{GeoPointField}}. At 
search time {{GeoPointTermsEnum}} recurses to a max precision that is computed 
by the Query parameters. This max precision is never the full precision, so 
creating and indexing the full precision terms is useless and wasteful (it was 
always a side effect of just using indexing logic from the Numeric type). 

Furthermore, since the numerical logic always stored high precision terms 
first, the recursion in {{GeoPointTermsEnum}} required transient memory for 
storing ranges. By moving the trie logic to its own {{GeoPointTokenStream}} and 
reversing the term order (such that lower resolution terms are first), the 
GeoPointTermsEnum can naturally traverse, enabling on-demand creation of 
PrefixTerms.


> Decouple GeoPointField from NumericType
> ---
>
> Key: LUCENE-6930
> URL: https://issues.apache.org/jira/browse/LUCENE-6930
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Nicholas Knize