[
https://issues.apache.org/jira/browse/LUCENE-5779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Smiley updated LUCENE-5779:
---------------------------------
Attachment: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch
The attached patch is a partial patch from LUCENE-5714 that includes just the
AreaSimilarity class and the new test for BBoxStrategy, which covers this new
similarity and shows example scores. Developing it surfaced a
variety of dateline-related bugs when computing intersection width and height.
> Improve BBox AreaSimilarity algorithm to consider lines and points
> ------------------------------------------------------------------
>
> Key: LUCENE-5779
> URL: https://issues.apache.org/jira/browse/LUCENE-5779
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/spatial
> Reporter: David Smiley
> Attachments: LUCENE-5779__Improved_bbox_AreaSimilarity_algorithm.patch
>
>
> GeoPortal's area overlap algorithm didn't consider lines and points; they end
> up turning the score to 0. I've thought about this for a bit and I've come up
> with an alternative scoring algorithm (already coded, tested, and
> documented).
> New Javadocs:
> {code:java}
> /**
> * The algorithm is implemented as envelope on envelope overlays rather than
> * complex polygon on complex polygon overlays.
> * <p/>
> * <p/>
> * Spatial relevance scoring algorithm:
> * <DL>
> * <DT>queryArea</DT> <DD>the area of the input query envelope</DD>
> * <DT>targetArea</DT> <DD>the area of the target envelope (per Lucene document)</DD>
> * <DT>intersectionArea</DT> <DD>the area of the intersection between the query and target envelopes</DD>
> * <DT>queryTargetProportion</DT> <DD>A 0-1 factor that divides the score proportion between query and target.
> * 0.5 is evenly.</DD>
> *
> * <DT>queryRatio</DT> <DD>intersectionArea / queryArea; (see note)</DD>
> * <DT>targetRatio</DT> <DD>intersectionArea / targetArea; (see note)</DD>
> * <DT>queryFactor</DT> <DD>queryRatio * queryTargetProportion;</DD>
> * <DT>targetFactor</DT> <DD>targetRatio * (1 - queryTargetProportion);</DD>
> * <DT>score</DT> <DD>queryFactor + targetFactor;</DD>
> * </DL>
> * Note: The actual computation of queryRatio and targetRatio is more complicated so that it considers
> * points and lines. Lines have the ratio of overlap, and points are either 1.0 or 0.0 depending on whether
> * they intersect or not.
> * <p />
> * Based on Geoportal's
> * <a href="http://geoportal.svn.sourceforge.net/svnroot/geoportal/Geoportal/trunk/src/com/esri/gpt/catalog/lucene/SpatialRankingValueSource.java">
> * SpatialRankingValueSource</a> but modified. GeoPortal's algorithm will yield a score of 0
> * if either a line or point is compared, and it doesn't output a 0-1 normalized score (it multiplies the factors).
> *
> * @lucene.experimental
> */
> {code}
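
The scoring rules in the Javadoc above can be illustrated with a small sketch. This is not the code from the attached patch: the class `AreaSimilaritySketch`, its `Envelope` type, and the helpers `overlap` and `ratio` are hypothetical names chosen for illustration. It assumes planar envelopes with no dateline wrapping, and its handling of lines and points is only one plausible reading of the note (a line scores by the fraction of its length that overlaps, a point scores 1.0 if it intersects and 0.0 otherwise).

```java
/** Hypothetical sketch of the scoring described above; not the patch's AreaSimilarity class. */
final class AreaSimilaritySketch {

  /** Axis-aligned envelope in planar coordinates; dateline wrapping is NOT handled (assumption). */
  static final class Envelope {
    final double minX, maxX, minY, maxY;

    Envelope(double minX, double maxX, double minY, double maxY) {
      this.minX = minX;
      this.maxX = maxX;
      this.minY = minY;
      this.maxY = maxY;
    }

    double width() { return maxX - minX; }

    double height() { return maxY - minY; }
  }

  /** Length of the overlap of [aMin, aMax] and [bMin, bMax], or -1 if the ranges are disjoint. */
  static double overlap(double aMin, double aMax, double bMin, double bMax) {
    double len = Math.min(aMax, bMax) - Math.max(aMin, bMin);
    return len < 0 ? -1 : len;
  }

  /** Fraction of the shape covered by the intersection, with assumed line and point handling. */
  static double ratio(Envelope shape, double interWidth, double interHeight) {
    if (interWidth < 0 || interHeight < 0) {
      return 0.0; // the envelopes do not intersect at all
    }
    double w = shape.width();
    double h = shape.height();
    if (w > 0 && h > 0) {
      return (interWidth * interHeight) / (w * h); // ordinary area ratio
    }
    if (w > 0) {
      return interWidth / w; // horizontal line: ratio of overlapping length
    }
    if (h > 0) {
      return interHeight / h; // vertical line: ratio of overlapping length
    }
    return 1.0; // a point that intersects
  }

  /** score = queryRatio * queryTargetProportion + targetRatio * (1 - queryTargetProportion). */
  static double score(Envelope query, Envelope target, double queryTargetProportion) {
    double interWidth = overlap(query.minX, query.maxX, target.minX, target.maxX);
    double interHeight = overlap(query.minY, query.maxY, target.minY, target.maxY);
    double queryFactor = ratio(query, interWidth, interHeight) * queryTargetProportion;
    double targetFactor = ratio(target, interWidth, interHeight) * (1 - queryTargetProportion);
    return queryFactor + targetFactor;
  }

  public static void main(String[] args) {
    Envelope query = new Envelope(0, 10, 0, 10);
    Envelope halfOverlap = new Envelope(5, 15, 0, 10);
    Envelope pointInside = new Envelope(5, 5, 5, 5);
    System.out.println(score(query, halfOverlap, 0.5));
    System.out.println(score(query, pointInside, 0.5));
  }
}
```

In this sketch a point target inside the query still contributes through targetRatio, so the sum stays above 0, whereas multiplying the two factors (as the Javadoc says GeoPortal does) would give 0 because queryRatio is 0. The sum also stays within 0-1 for any queryTargetProportion in that range.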