[ 
https://issues.apache.org/jira/browse/LUCENE-5408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Smiley updated LUCENE-5408:
---------------------------------

    Description: 
I've started work on a new SpatialStrategy implementation I'm tentatively 
calling SerializedDVStrategy.  It's similar to the [JtsGeoStrategy in 
Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts]
 but a little different in the details -- certainly faster.  Using Spatial4j 
0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this is 
internally WKB format) and the strategy will put it in a BinaryDocValuesField.  
In practice the shape is likely a polygon but it needn't be.  Then I'll 
implement a Filter that returns a DocIdSetIterator that evaluates a given 
document passed via advance(docid) to see if the query shape matches a shape 
in DocValues.  It should not be used in a situation where it would evaluate 
every document ID via nextDoc().  And in practice the DocValues 
format chosen should be a disk resident one since each value tends to be kind 
of big.
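
The per-document check described above can be illustrated without Lucene or 
Spatial4j.  What follows is only a minimal sketch under stated assumptions: a 
plain Map stands in for the BinaryDocValues field, a fixed 32-byte rectangle 
encoding stands in for the BinaryCodec output (it is not WKB), and the names 
ShapeVerifierSketch, encode, index and matches are hypothetical, not part of 
any patch.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch; none of these names come from Lucene or Spatial4j.
final class ShapeVerifierSketch {
    // Stand-in for a BinaryDocValues field: docId -> serialized shape bytes.
    private final Map<Integer, byte[]> docValues = new HashMap<>();

    // Serialize a rectangle as four doubles (a toy format, not WKB).
    static byte[] encode(double minX, double minY, double maxX, double maxY) {
        return ByteBuffer.allocate(4 * Double.BYTES)
                .putDouble(minX).putDouble(minY)
                .putDouble(maxX).putDouble(maxY)
                .array();
    }

    // Store the serialized shape for one document.
    void index(int docId, byte[] shapeBytes) {
        docValues.put(docId, shapeBytes);
    }

    // Decode the stored bytes for one document and test them against the
    // query rectangle. This is the work done for each docId the caller
    // advances to; documents with no stored shape never match.
    boolean matches(int docId, double qMinX, double qMinY, double qMaxX, double qMaxY) {
        byte[] bytes = docValues.get(docId);
        if (bytes == null) {
            return false;
        }
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        double minX = buf.getDouble();
        double minY = buf.getDouble();
        double maxX = buf.getDouble();
        double maxY = buf.getDouble();
        // Two rectangles intersect when they overlap on both axes.
        return minX <= qMaxX && maxX >= qMinX && minY <= qMaxY && maxY >= qMinY;
    }
}
```

Every call to matches() decodes a stored value, which is why the check is 
meant to run only for the documents a caller explicitly advances to.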

This spatial strategy in and of itself has no _index_; it's O(N) where N is the 
number of documents that get passed through it.  So it should be placed last in 
the query/filter tree so that the other queries limit the documents it needs to 
see.  At a minimum, it should be used in conjunction with another 
SpatialStrategy like RecursivePrefixTreeStrategy.
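
The ordering advice above can be sketched roughly as follows, assuming a 
cheaper query/filter has already produced an array of candidate doc IDs.  This 
is not Lucene API: VerifyLastSketch and verify are hypothetical names, and the 
IntPredicate stands in for the per-document geometry check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

// Hypothetical sketch; not Lucene API.
final class VerifyLastSketch {
    // Number of times the expensive check has run.
    int verifications = 0;

    // Run the expensive check only on candidates that a cheaper filter
    // already accepted, so the cost is O(candidates), not O(all documents).
    List<Integer> verify(int[] candidates, IntPredicate expensiveCheck) {
        List<Integer> hits = new ArrayList<>();
        for (int docId : candidates) {
            verifications++;
            if (expensiveCheck.test(docId)) {
                hits.add(docId);
            }
        }
        return hits;
    }
}
```

If the cheaper filter passes three candidates, the expensive check runs three 
times, however many documents the index holds.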

Eventually once the PrefixTree grid encoding has a little bit more metadata, it 
will be possible to further combine the grid & this strategy in such a way that 
many documents won't need to be checked against the serialized geometry.

  was:
I've started work on a new SpatialStrategy implementation I'm tentatively 
calling GeometryStrategy.  It's similar to the [JtsGeoStrategy in 
Spatial-Solr-Sandbox|https://github.com/ryantxu/spatial-solr-sandbox/tree/master/LSE/src/main/java/org/apache/lucene/spatial/pending/jts]
 but a little different in the details -- certainly faster.  Using Spatial4j 
0.4's BinaryCodec, it'll serialize the shape to bytes (for polygons this is 
internally WKB format) and the strategy will put it in a BinaryDocValuesField.  
In practice the shape is likely a polygon but it needn't be.  Then I'll 
implement a Filter that returns a DocIdSetIterator that evaluates a given 
document passed via advance(docid) to see if the query shape matches a shape 
in DocValues.  It should not be used in a situation where it would evaluate 
every document ID via nextDoc().  And in practice the DocValues 
format chosen should be a disk resident one since each value tends to be kind 
of big.

This spatial strategy in and of itself has no _index_; it's O(N) where N is the 
number of documents that get passed through it.  So it should be placed last in 
the query/filter tree so that the other queries limit the documents it needs to 
see.  At a minimum, it should be used in conjunction with another 
SpatialStrategy like RecursivePrefixTreeStrategy.

Eventually once the PrefixTree grid encoding has a little bit more metadata, it 
will be possible to further combine the grid & this strategy in such a way that 
many documents won't need to be checked against the serialized geometry.


> SerializedDVStrategy -- match geometries in DocValues
> -----------------------------------------------------
>
>                 Key: LUCENE-5408
>                 URL: https://issues.apache.org/jira/browse/LUCENE-5408
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spatial
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 4.7
>
>         Attachments: LUCENE-5408_GeometryStrategy.patch, 
> LUCENE-5408_SerializedDVStrategy.patch


