[jira] [Commented] (LUCENE-8615) Can LatLonShape's tessellator create more search-efficient triangles?

2020-01-17 Thread Adrien Grand (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017875#comment-17017875
 ] 

Adrien Grand commented on LUCENE-8615:
--

This sounds like an interesting idea!

> Can LatLonShape's tessellator create more search-efficient triangles?
> -
>
> Key: LUCENE-8615
> URL: https://issues.apache.org/jira/browse/LUCENE-8615
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: 2-tessellations.png, re-tessellate-triangle.png, 
> screenshot-1.png
>
>
> The triangular mesh produced by LatLonShape's Tessellator creates reasonable 
> numbers of triangles, which is helpful for indexing speed. However I'm 
> wondering that there are conditions when it might be beneficial to run 
> tessellation slightly differently in order to create triangles that are more 
> search-friendly. Given that we only index the minimum bounding rectangle for 
> each triangle, we always check for intersection between the query and the 
> triangle if the query intersects with the MBR of the triangle. So the smaller 
> the area of the triangle compared to its MBR, the higher the likeliness to 
> have false positive when querying.
> For instance see the following shape, there are two ways that it can be 
> tessellated into two triangles. LatLonShape's Tessellator is going to return 
> either of them depending on which point is listed first in the polygon. Yet 
> the first one is more efficient that the second one: with the second one, 
> both triangles have roughly the same MBR (which is also the MBR of the 
> polygon), so both triangles will need to be checked all the time whenever the 
> query intersects with this shared MBR. On the other hand, with the first way, 
> both MBRs are smaller and don't overlap, which makes it more likely that only 
> one triangle needs to be checked at query time.
>  !2-tessellations.png! 
> Another example is the following polygon. It can be tessellated into a single 
> triangle. Yet at times it might be a better idea create more triangles so 
> that the overall area of MBRs is smaller and queries are less likely to run 
> into false positives.
>  !re-tessellate-triangle.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org



[jira] [Commented] (LUCENE-8615) Can LatLonShape's tessellator create more search-efficient triangles?

2020-01-14 Thread Ignacio Vera (Jira)


[ 
https://issues.apache.org/jira/browse/LUCENE-8615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17014971#comment-17014971
 ] 

Ignacio Vera commented on LUCENE-8615:
--

Here is an idea. We currently have the following type of triangles in the index:

!screenshot-1.png|width=432,height=464!

This plot shows that the potentially more wasteful triangles are the ones where 
two of the points belongs to bounding box (first four possibilities). So I 
wonder we should prevent to add those types of triangles to the index and 
instead split them using the longest side.

Note that the side effect is that we can reduce the number of triangle types to 
4.

> Can LatLonShape's tessellator create more search-efficient triangles?
> -
>
> Key: LUCENE-8615
> URL: https://issues.apache.org/jira/browse/LUCENE-8615
> Project: Lucene - Core
>  Issue Type: Improvement
>Reporter: Adrien Grand
>Priority: Minor
> Attachments: 2-tessellations.png, re-tessellate-triangle.png, 
> screenshot-1.png
>
>
> The triangular mesh produced by LatLonShape's Tessellator creates reasonable 
> numbers of triangles, which is helpful for indexing speed. However I'm 
> wondering that there are conditions when it might be beneficial to run 
> tessellation slightly differently in order to create triangles that are more 
> search-friendly. Given that we only index the minimum bounding rectangle for 
> each triangle, we always check for intersection between the query and the 
> triangle if the query intersects with the MBR of the triangle. So the smaller 
> the area of the triangle compared to its MBR, the higher the likeliness to 
> have false positive when querying.
> For instance see the following shape, there are two ways that it can be 
> tessellated into two triangles. LatLonShape's Tessellator is going to return 
> either of them depending on which point is listed first in the polygon. Yet 
> the first one is more efficient that the second one: with the second one, 
> both triangles have roughly the same MBR (which is also the MBR of the 
> polygon), so both triangles will need to be checked all the time whenever the 
> query intersects with this shared MBR. On the other hand, with the first way, 
> both MBRs are smaller and don't overlap, which makes it more likely that only 
> one triangle needs to be checked at query time.
>  !2-tessellations.png! 
> Another example is the following polygon. It can be tessellated into a single 
> triangle. Yet at times it might be a better idea create more triangles so 
> that the overall area of MBRs is smaller and queries are less likely to run 
> into false positives.
>  !re-tessellate-triangle.png! 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org