On Tue, Dec 29, 2009 at 8:41 PM, patrick o'leary <pj...@pjaol.com> wrote:
> Afraid I just took a sample set of data that was available to me at my
> last job, and ran the test.
> It kind of matched my expectations in terms of locallucene at the time,
> and what Ure predicted for Trie.

Do you still think there would be such a drastic difference in a lower
density situation?

> To give you an idea of its performance in production, the bounding box
> retrieval for a single solr core of about 3 million docs
> on a dual core 2.3 GHz server with I think 8 GB of RAM, was about
> 8 - 12 ms avg. And had ~ 3,000 results per result set.
>
> The slow part for geo search was always the distance calculation, not
> the bounding box retrieval.
> I've seen feedback where hilbert curve is meant to be faster again by
> an average of 40%, so say 4-6 ms for bounding box retrieval.

Yeah I have looked into hilbert curve a little myself. Do you think it's
an approach worth investigating? Or will it add more complexity?

> But that still doesn't solve the long haul of distance calculations,
> which has been one of my focuses recently with a new projection and
> distance calculation based upon that projection.

Tell us more! Yeah I also ran into the cost of the distance calculations,
which is why I went down the road of doing the calculations in parallel,
and addressing the cost of the actual calculations themselves. This has
been pretty effective, but I am very interested in this new projection
idea.

> On Tue, Dec 29, 2009 at 11:31 AM, Chris Male <gento...@gmail.com> wrote:
>
> > Hi,
> >
> > I had never done any experiments comparing them, that was what I was
> > hoping was going to be explored more and it seems you have done that.
> > Do you have more statistics by chance? Does the difference (which is
> > pretty dramatic) stay a constant ratio as you change the density
> > and/or distances?
> >
> > On Tue, Dec 29, 2009 at 8:25 PM, patrick o'leary <pj...@pjaol.com> wrote:
> >
> > > Hmm, so it's faster to do 2 range searches than use the
> > > TermEnumerator to find maybe 4-6 individual CartesianTier id's?
> > >
> > > I had similar approaches in the past, like 2 years ago, that just
> > > weren't fast enough, and I've even published comparisons with Trie
> > > data types, and find CartesianTier id's
> > >
> > > https://issues.apache.org/jira/browse/SOLR-773?focusedCommentId=12708605&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12708605
> > >
> > > The speed of Trie matches what Ure's expectations were, about 100ms,
> > > but Cartesian is just 12ms.
> > >
> > > The custom code, well you'd have to have custom code to figure out
> > > the bounding box from a point, unless you want the user to figure
> > > that out?
> > > And the Cartesian stuff is pretty small, its underlying structure
> > > can / and now does use Trie (simply because it's the only numeric
> > > field cache interface common between lucene and solr).
> > >
> > > P
> > >
> > >
> > > On Tue, Dec 29, 2009 at 11:11 AM, Chris Male (JIRA) <j...@apache.org> wrote:
> > >
> > > > [ https://issues.apache.org/jira/browse/SOLR-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12795112#action_12795112 ]
> > > >
> > > > Chris Male commented on SOLR-1586:
> > > > ----------------------------------
> > > >
> > > > Ah yes sorry TrieFields. I don't see searching 2 fields as a
> > > > downside since that's just an implementation detail like the
> > > > Spatial Tile (which requires you to have up to 15 fields).
> > > > Assuming you can use the Point FieldType to index an x and y
> > > > field, then it just becomes another option like Spatial Tile.
> > > > The fact they are supported out of the box is part of the
> > > > attraction, as it would reduce how much custom code has to be
> > > > maintained.
> > > >
> > > > > Create Spatial Point FieldTypes
> > > > > -------------------------------
> > > > >
> > > > >          Key: SOLR-1586
> > > > >          URL: https://issues.apache.org/jira/browse/SOLR-1586
> > > > >      Project: Solr
> > > > >   Issue Type: Improvement
> > > > >     Reporter: Grant Ingersoll
> > > > >     Assignee: Grant Ingersoll
> > > > >     Priority: Minor
> > > > >      Fix For: 1.5
> > > > >
> > > > >  Attachments: examplegeopointdoc.patch.txt,
> > > > >               SOLR-1586-geohash.patch,
> > > > >               SOLR-1586.Mattmann.112209.geopointonly.patch.txt,
> > > > >               SOLR-1586.Mattmann.112209.geopointonly.patch.txt,
> > > > >               SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt,
> > > > >               SOLR-1586.Mattmann.112409.geopointandgeohash.patch.txt,
> > > > >               SOLR-1586.Mattmann.112509.geopointandgeohash.patch.txt,
> > > > >               SOLR-1586.Mattmann.120709.geohashonly.patch.txt,
> > > > >               SOLR-1586.Mattmann.121209.geohash.outarr.patch.txt,
> > > > >               SOLR-1586.Mattmann.121209.geohash.outstr.patch.txt,
> > > > >               SOLR-1586.Mattmann.122609.patch.txt,
> > > > >               SOLR-1586.patch, SOLR-1586.patch
> > > > >
> > > > > Per SOLR-773, create field types that hide the details of
> > > > > creating tiers, geohash and lat/lon fields.
> > > > > Fields should take in lat/lon points in a single form, as in:
> > > > > <field name="foo">lat lon</field>
> > > >
> > > > --
> > > > This message is automatically generated by JIRA.
> > > > -
> > > > You can reply to this email to add a comment to the issue online.
> >
> > --
> > Chris Male | Software Developer | JTeam BV.| www.jteam.nl

--
Chris Male | Software Developer | JTeam BV.| www.jteam.nl
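
For anyone following the thread, the two-step pattern discussed above (a cheap bounding-box retrieval via two range checks, followed by the costlier distance calculation on the survivors only) can be sketched roughly as below. This is only an illustration and not code from locallucene, Solr, or any patch on SOLR-1586: the function names are hypothetical, the in-memory list stands in for range queries on indexed lat/lon fields, and the haversine formula on a sphere of radius 6371 km is an assumption about how distance is measured.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumption: mean radius of a spherical Earth


def bounding_box(lat, lon, radius_km):
    """Return (min_lat, max_lat, min_lon, max_lon) in degrees.

    Simplification: assumes the circle does not reach a pole or cross
    the antimeridian.
    """
    ang = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(ang)
    dlon = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lon - dlon, lon + dlon


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def search(points, lat, lon, radius_km):
    """Return the (lat, lon) tuples within radius_km of the centre."""
    min_lat, max_lat, min_lon, max_lon = bounding_box(lat, lon, radius_km)
    # Step 1: two range checks, standing in for range queries on the
    # indexed lat and lon fields.
    candidates = [p for p in points
                  if min_lat <= p[0] <= max_lat and min_lon <= p[1] <= max_lon]
    # Step 2: the exact distance is computed only for the candidates.
    return [p for p in candidates
            if haversine_km(lat, lon, p[0], p[1]) <= radius_km]
```

A point in a corner of the box passes step 1 but is dropped in step 2, which is why the distance calculation still runs once per box hit and can dominate when a box returns thousands of results.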