[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14577886#comment-14577886 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1684285 from [~mikemccand] in branch 'dev/trunk' [ https://svn.apache.org/r1684285 ] LUCENE-6481: add simple API to index geo lat/lon points and search by bounding box or polygon Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: New Feature Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14577835#comment-14577835 ] Michael McCandless commented on LUCENE-6481: bq. For some reason a diff with the latest branch introduced a lot of duplicate changes Oh, sorry ... the branch was supposed to make things easier. But a trunk patch is fine too. I like testPacManPolyQuery! We can remove that NOTE in GeoPointField javadocs right? Else, this looks great, I'll commit soon! Thanks [~nknize]. Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: New Feature Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14577906#comment-14577906 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1684286 from [~mikemccand] in branch 'dev/branches/branch_5x' [ https://svn.apache.org/r1684286 ] LUCENE-6481: add simple API to index geo lat/lon points and search by bounding box or polygon Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: New Feature Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14573429#comment-14573429 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1683615 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1683615 ] LUCENE-6481: merge trunk Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14564784#comment-14564784 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1682453 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1682453 ] LUCENE-6481: Nick's latest patch: create range terms once per query, not per segment; check cell intersection against polygon not its bbox for more restrictive recursing Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563508#comment-14563508 ] Nicholas Knize commented on LUCENE-6481: Note: this is a diff off the LUCENE-6481 branch. Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563507#comment-14563507 ] Nicholas Knize commented on LUCENE-6481: Note: this is a diff off the LUCENE-6481 branch. Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14563506#comment-14563506 ] Nicholas Knize commented on LUCENE-6481: Note: this is a diff off the LUCENE-6481 branch. Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560693#comment-14560693 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681942 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681942 ] LUCENE-6481: add nocommit Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560684#comment-14560684 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681941 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681941 ] LUCENE-6481: sometimes use smallish bbox for random test Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560690#comment-14560690 ] Michael McCandless commented on LUCENE-6481: I committed a small improvement to the test, to sometimes test on a smallish random sized bbox, and it hits this new failure: {noformat} [junit4] Started J0 PID(15031@localhost). [junit4] Suite: org.apache.lucene.search.TestGeoPointQuery [junit4] 1 smallBBox=true [junit4] 1 T0: iter=1 id=3 docID=3 lat=-90.2957562284163 lon=-179.83102465432302 (bbox: lat=-90.84003988710097 TO -89.86198542441215 lon=-180.5339936365967 TO -179.31722107698667) expected true but got: false deleted?=false query=GeoPointInBBoxQuery: field=geoField: Lower Left: [-180.5339936365967,-90.84003988710097] Upper Right: [-179.31722107698667,-89.86198542441215] [junit4] 2 ??? 27, 2015 12:37:04 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException [junit4] 2 WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeoPointQuery] [junit4] 2 java.lang.AssertionError: wrong result [junit4] 2at __randomizedtesting.SeedInfo.seed([5]:0) [junit4] 2at org.junit.Assert.fail(Assert.java:93) [junit4] 2at org.apache.lucene.search.TestGeoPointQuery$1._run(TestGeoPointQuery.java:372) [junit4] 2at org.apache.lucene.search.TestGeoPointQuery$1.run(TestGeoPointQuery.java:271) [junit4] 2 [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestGeoPointQuery -Dtests.method=testRandomTiny -Dtests.seed=5 -Dtests.locale=sr_ME -Dtests.timezone=Europe/Vilnius -Dtests.asserts=true -Dtests.file.encoding=US-ASCII [junit4] ERROR 0.09s | TestGeoPointQuery.testRandomTiny {noformat} I'm not sure why testRandomTiny is hitting all these problems! Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560702#comment-14560702 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681943 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681943 ] LUCENE-6481: woops Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560704#comment-14560704 ] Michael McCandless commented on LUCENE-6481: bq. I committed a small improvement to the test, to sometimes test on a smallish random sized bbox, and it hits this new failure: Woops, my bad: test bug. I committed a fix... Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14560985#comment-14560985 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1682029 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1682029 ] LUCENE-6481: validate lat/lon passed to the queries Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14561056#comment-14561056 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1682043 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1682043 ] LUCENE-6481: add nocommit about query slowness for large bboxes; improve javadocs Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14561044#comment-14561044 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1682036 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1682036 ] LUCENE-6481: restrict random test cases to 'smallish' bbox; switch static factory for polygon query to ctor Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14561052#comment-14561052 ] Michael McCandless commented on LUCENE-6481: I fixed the randomized test case to only test a smallish global bbox (up to +/- 1.5 in lat and lon directions)... when testing the full space the test runs very very slowly, even testRandomTiny, because it can require 100s of K terms ... I'm not quite sure why (?) but the query classes do advertise that they should be used on smallish rectangles. Also, the test fails because of boundary cases: {noformat} T0: iter=57 id=6819 docID=6723 lat=-81.12143028547105 lon=-168.98618510239544 (bbox: lat=-81.22948018485512 TO -80.9313958433031 lon=-168.98618505380117 TO -168.77958782241174) expected true but got: false deleted?=false query=GeoPointInBBoxQuery: field=geoField: Lower Left: [-168.98618505380117,-81.22948018485512] Upper Right: [-168.77958782241174,-80.9313958433031] máj 27, 2015 2:16:52 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeoPointQuery] java.lang.AssertionError: wrong result at __randomizedtesting.SeedInfo.seed([4B5245DED09AE592]:0) at org.junit.Assert.fail(Assert.java:93) at org.apache.lucene.search.TestGeoPointQuery$1._run(TestGeoPointQuery.java:380) at org.apache.lucene.search.TestGeoPointQuery$1.run(TestGeoPointQuery.java:279) {noformat} It's odd because in GeoUtils#bboxContains, we do try to take the quantization into account ... Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559294#comment-14559294 ] Michael McCandless commented on LUCENE-6481: I was curious about the polygon query performance, so I tweaked the bboxes around London, UK benchmark to just use a polygon query instead (with 5 points, first and last are the same to the polygon is close), and surprisingly the performance is only a bit slower than the bbox case (19.1 msec vs 17.8 msec) ... I expected much slower because in the polygon case we cannot use the prefix terms, I think? Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559808#comment-14559808 ] Michael McCandless commented on LUCENE-6481: I made test a bit more evil (use threads, sometimes randomly do poly query), but I hit this: {noformat} [junit4] Started J0 PID(12760@localhost). [junit4] Suite: org.apache.lucene.search.TestGeoPointQuery [junit4] OK 0.04s | TestGeoPointQuery.testBBoxQuery [junit4] 2 de maig 26, 2015 3:30:16 PM com.carrotsearch.randomizedtesting.RandomizedRunner$QueueUncaughtExceptionsHandler uncaughtException [junit4] 2 WARNING: Uncaught exception in thread: Thread[T0,5,TGRP-TestGeoPointQuery] [junit4] 2 java.lang.OutOfMemoryError: Java heap space [junit4] 2at __randomizedtesting.SeedInfo.seed([E57ABD1CCC7F4388]:0) [junit4] 2at java.util.LinkedList.toArray(LinkedList.java:1050) [junit4] 2at java.util.List.sort(List.java:477) [junit4] 2at java.util.Collections.sort(Collections.java:141) [junit4] 2at org.apache.lucene.search.GeoPointInBBoxQuery$GeoBBoxTermsEnum.init(GeoPointInBBoxQuery.java:157) [junit4] 2at org.apache.lucene.search.GeoPointInBBoxQuery.getTermsEnum(GeoPointInBBoxQuery.java:86) [junit4] 2at org.apache.lucene.search.MultiTermQuery.getTermsEnum(MultiTermQuery.java:345) [junit4] 2at org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.rewrite(MultiTermQueryConstantScoreWrapper.java:146) [junit4] 2at org.apache.lucene.search.MultiTermQueryConstantScoreWrapper$1.scorer(MultiTermQueryConstantScoreWrapper.java:212) [junit4] 2at org.apache.lucene.search.LRUQueryCache$CachingWrapperWeight.scorer(LRUQueryCache.java:585) [junit4] 2at org.apache.lucene.search.Weight.bulkScorer(Weight.java:137) [junit4] 2at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:72) [junit4] 2at org.apache.lucene.search.AssertingWeight.bulkScorer(AssertingWeight.java:72) [junit4] 2at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:560) [junit4] 2at org.apache.lucene.search.AssertingIndexSearcher.search(AssertingIndexSearcher.java:93) [junit4] 2at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:367) [junit4] 2at org.apache.lucene.search.TestGeoPointQuery$1._run(TestGeoPointQuery.java:337) [junit4] 2at org.apache.lucene.search.TestGeoPointQuery$1.run(TestGeoPointQuery.java:265) [junit4] 2 [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestGeoPointQuery -Dtests.method=testRandomTiny -Dtests.seed=E57ABD1CCC7F4388 -Dtests.locale=ca_ES -Dtests.timezone=America/Rankin_Inlet -Dtests.asserts=true -Dtests.file.encoding=UTF-8 [junit4] ERROR 49.3s | TestGeoPointQuery.testRandomTiny {noformat} It's curious because the {{testRandomTiny}} only indexes a few 10s of points... Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559762#comment-14559762 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681836 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681836 ] LUCENE-6481: fix some ant precommit issues; improve test (use threads; randomly test poly query too), but currently disabled; add validation of the incoming lat/lon to poly query; improve query toStrings a bit Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559786#comment-14559786 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681842 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681842 ] LUCENE-6481: make explicit separate methods for the lat vs lon cases Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559805#comment-14559805 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681849 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681849 ] LUCENE-6481: enable threads and random poly query in test Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559820#comment-14559820 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681853 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681853 ] LUCENE-6481: add back lost GeoPointInPolygonQuery.isWithin override Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559710#comment-14559710 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681827 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681827 ] LUCENE-6481: woops Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559722#comment-14559722 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681828 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681828 ] LUCENE-6481: fix some precommit issues Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559677#comment-14559677 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681820 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681820 ] LUCENE-6481: make branch Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559685#comment-14559685 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681821 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681821 ] LUCENE-6481: commit latest patch Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559687#comment-14559687 ] Michael McCandless commented on LUCENE-6481: I made a branch so we can iterate a bit: https://svn.apache.org/repos/asf/lucene/dev/branches/LUCENE-6481 I'll make some small fixes to satisfy ant precommit, and fold some of the improvements to the random test from LUCENE-6477 Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559827#comment-14559827 ] ASF subversion and git services commented on LUCENE-6481: - Commit 1681855 from [~mikemccand] in branch 'dev/branches/LUCENE-6481' [ https://svn.apache.org/r1681855 ] LUCENE-6481: go back to 1 thread, no poly query, until we track down testRandomTiny failures Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14559269#comment-14559269 ] Michael McCandless commented on LUCENE-6481: Thanks [~nknize], tests pass for me now, and I ran the same bboxes around London, UK perf test and this is much faster than before (= LUCENE-6450): I now see ~17.8 msec avg per query (~2X faster than GeoHashPrefixTree I think). I'll have a closer look at the patch ... I really want to make a visualization here just like the BKD tree ones (https://plus.google.com/+MichaelMcCandless/posts/Daj9FgYPdtv) to see how the morton-shapes work in filling in a polygon... I think there are fun things we can explore (future!) to speed things up further, e.g. if we also index lat/lon into doc values, then we can use that for filtering, freeing us to also use prefix terms on boundary shapes, and also maybe freeing us to use encodings like Hilbert curves which should give better locality / able to use prefix terms more frequently since you would no longer need a fast decode from term - lat/lon. But, it would use more disk space... we can also integrate with geo3d so we get shape intersection for faster polygon querying (now it must filter every point?). Later! Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14557255#comment-14557255 ] Michael McCandless commented on LUCENE-6481: Ooh, awesome progress! I'll look in more detail, but: I applied the patch and hit this test failure: {noformat} ant test -Dtestcase=TestGeoPointQuery -Dtests.method=testBBoxQuery -Dtests.seed=FC153607634373BA -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=es -Dtests.timezone=Asia/Jerusalem -Dtests.asserts=true -Dtests.file.encoding=UTF-8 [junit4] Started J0 PID(14260@localhost). [junit4] Suite: org.apache.lucene.search.TestGeoPointQuery [junit4] 2 NOTE: reproduce with: ant test -Dtestcase=TestGeoPointQuery -Dtests.method=testBBoxQuery -Dtests.seed=FC153607634373BA -Dtests.nightly=true -Dtests.slow=true -Dtests.linedocsfile=/lucenedata/hudson.enwiki.random.lines.txt.fixed -Dtests.locale=es -Dtests.timezone=Asia/Jerusalem -Dtests.asserts=true -Dtests.file.encoding=UTF-8 [junit4] FAILURE 0.09s | TestGeoPointQuery.testBBoxQuery [junit4] Throwable #1: java.lang.AssertionError: GeoBoundingBoxQuery failed expected:1 but was:2 [junit4]at __randomizedtesting.SeedInfo.seed([FC153607634373BA:5CE7EC6FB0DC9159]:0) [junit4]at org.apache.lucene.search.TestGeoPointQuery.testBBoxQuery(TestGeoPointQuery.java:113) [junit4]at java.lang.Thread.run(Thread.java:745) [junit4] 2 NOTE: leaving temporary files on disk at: /l/nick/lucene/build/sandbox/test/J0/temp/lucene.search.TestGeoPointQuery FC153607634373BA-001 [junit4] 2 NOTE: test params are: codec=Asserting(Lucene50): {geoField=PostingsFormat(name=MockRandom)}, docValues:{}, sim=DefaultSimilarity, locale=es, timezone=Asia/Jerusalem [junit4] 2 NOTE: Linux 3.13.0-46-generic amd64/Oracle Corporation 1.8.0_40 (64-bit)/cpus=8,threads=1,free=385990872,total=504889344 [junit4] 2 NOTE: All tests run in this JVM: [TestGeoPointQuery] [junit4] Completed [1/1] in 0.48s, 1 test, 1 failure FAILURES! {noformat} Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14556943#comment-14556943 ] Nicholas Knize commented on LUCENE-6481: This patch is ready to go. In fact, this performance improvement should supersede the existing patch in [LUCENE-6450|https://issues.apache.org/jira/browse/LUCENE-6450]. Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-6481) Improve GeoPointField type to only visit high precision boundary terms
[ https://issues.apache.org/jira/browse/LUCENE-6481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14543767#comment-14543767 ] Michael McCandless commented on LUCENE-6481: bq. The test had the lat and lon ordering incorrect for both GeoPointFieldType and the GeoPointInBBoxQuery. Oh, woops :) Thanks for fixing1 Improve GeoPointField type to only visit high precision boundary terms --- Key: LUCENE-6481 URL: https://issues.apache.org/jira/browse/LUCENE-6481 Project: Lucene - Core Issue Type: Improvement Components: core/index Reporter: Nicholas Knize Attachments: LUCENE-6481.patch, LUCENE-6481.patch, LUCENE-6481_WIP.patch Current GeoPointField [LUCENE-6450 | https://issues.apache.org/jira/browse/LUCENE-6450] computes a set of ranges along the space-filling curve that represent a provided bounding box. This determines which terms to visit in the terms dictionary and which to skip. This is suboptimal for large bounding boxes as we may end up visiting all terms (which could be quite large). This incremental improvement is to improve GeoPointField to only visit high precision terms in boundary ranges and use the postings list for ranges that are completely within the target bounding box. A separate improvement is to switch over to auto-prefix and build an Automaton representing the bounding box. That can be tracked in a separate issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org