Wow, thank you David!
It is really kind of you to spend your time writing all this information
for me. This will be very helpful for my thesis work.

Thank you again.

2015-01-11 2:46 GMT+01:00 <

> Hello Matteo,
> Welcome. You are not bothering me or us; you are asking in the right place.
> Jack’s right in terms of the field type dictating how it works.
> LatLonType simply stores the latitude and longitude internally as
> separate floating point fields and it does efficient range queries over
> them for bounding-box queries.  Lucene has remarkably fast/efficient range
> queries over numbers based on a Trie/PrefixTree. In fact systems like
> TitanDB leave such queries to Lucene.  For point-radius queries, it
> iterates over all of the points in memory in a brute-force fashion (not
> scalable, but may be fine).
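
A minimal sketch of the two query styles described above, not Solr's actual implementation. It assumes a hypothetical in-memory `DOCS` mapping of document id to (latitude, longitude); the function names are made up for illustration. The bounding box is just two independent 1-D range checks, while the point-radius query computes a great-circle distance for every document.

```python
import math

# Hypothetical data: doc id -> (latitude, longitude) in degrees.
DOCS = {
    "rome": (41.9028, 12.4964),
    "milan": (45.4642, 9.1900),
    "paris": (48.8566, 2.3522),
}

def bbox_query(docs, min_lat, max_lat, min_lon, max_lon):
    """Bounding box as two separate range filters (no dateline handling)."""
    return sorted(
        doc_id
        for doc_id, (lat, lon) in docs.items()
        if min_lat <= lat <= max_lat and min_lon <= lon <= max_lon
    )

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres on a spherical Earth."""
    earth_radius_km = 6371.0088
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))

def radius_query(docs, lat, lon, radius_km):
    """Brute force: compute the distance to every document, keep the near ones."""
    return sorted(
        doc_id
        for doc_id, (d_lat, d_lon) in docs.items()
        if haversine_km(lat, lon, d_lat, d_lon) <= radius_km
    )
```

The brute-force loop is linear in the number of points, which is why it does not scale but may be fine for small indexes.
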
> BBoxField is similar in spirit to LatLonType; each side of an indexed
> rectangle gets its own floating point field internally.
> Note that for both field types listed above, the underlying storage and range queries
> use built-in numeric fields.
> SpatialRecursivePrefixTreeFieldType (RPT for short) is interesting in that
> it supports indexing essentially any shape by representing the indexed
> shape as multiple grid squares.  Non-point shapes (e.g. a polygon) are
> approximated; if you need accuracy, you should additionally store the
> vector geometry and validate the results in a 2nd pass (see
> SerializedDVStrategy for help with that).  RPT, like Lucene’s numeric
> fields, uses a Trie/PrefixTree but encodes two dimensions, not one.
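
A minimal sketch of the grid-square idea, assuming a simple quadtree-style grid over the whole world where each level appends one letter (A to D) to a cell token. The helper names (`cover`, `point_token`) and the letter scheme are hypothetical, chosen only for illustration; this is not the encoding Lucene uses. A rectangle is approximated by the cells that intersect it, and a point matches if its own token starts with one of those cell tokens.

```python
# Cells are (min_x, min_y, max_x, max_y) with x = longitude, y = latitude.
WORLD = (-180.0, -90.0, 180.0, 90.0)

def children(cell):
    """Split a cell into four quadrants, each labelled with one letter."""
    min_x, min_y, max_x, max_y = cell
    mid_x, mid_y = (min_x + max_x) / 2, (min_y + max_y) / 2
    return {
        "A": (min_x, mid_y, mid_x, max_y),  # north-west
        "B": (mid_x, mid_y, max_x, max_y),  # north-east
        "C": (min_x, min_y, mid_x, mid_y),  # south-west
        "D": (mid_x, min_y, max_x, mid_y),  # south-east
    }

def intersects(a, b):
    """True if the two rectangles share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def contains(outer, inner):
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])

def cover(shape, max_level, cell=WORLD, token=""):
    """Tokens of the grid cells that approximate the rectangle `shape`."""
    if not intersects(cell, shape):
        return []
    # Stop early when the cell lies fully inside the shape, or at max depth.
    if contains(shape, cell) or len(token) >= max_level:
        return [token]
    tokens = []
    for letter, child in children(cell).items():
        tokens += cover(shape, max_level, child, token + letter)
    return tokens

def point_token(x, y, level):
    """Token of the cell containing point (x, y) at the given depth."""
    cell, token = WORLD, ""
    for _ in range(level):
        for letter, child in children(cell).items():
            if child[0] <= x <= child[2] and child[1] <= y <= child[3]:
                cell, token = child, token + letter
                break
    return token
```

With a shallow `max_level` the covering cells can be much larger than the shape itself, which is the approximation mentioned above; a second pass against the exact geometry would remove the false positives.
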
> The Trie/PrefixTree concept underlies both RPT and numeric fields, which
> are approaches to using Lucene’s terms index to encode prefixes.  So the
> big point here is that Lucene/Solr doesn’t have side indexes using
> fundamentally different technologies for different types of data.
> Instead, Lucene’s one versatile index looks up terms (for keyword search),
> numbers, and 2-D spatial data.  For keyword search, the term is a word;
> for numbers, the term represents a contiguous range of values (e.g.
> 100-200); and for 2-D spatial, a term is a grid square (a 2-D range).
> I am aware many other DBs put spatial data in R-Trees, and I have no
> interest in investing energy in doing that in Lucene.  That isn’t to say I
> think that other DBs shouldn’t be using R-Trees.  I think systems based on
> sorted keys/terms (like Lucene and Cassandra, Accumulo, HBase, and others)
> already have a powerful/versatile index, such that it doesn’t warrant the
> complexity of adding something different.  And Lucene’s underlying index
> continues to improve.  I am most excited about an “auto-prefixing”
> technique McCandless has been working on that will bring performance up to
> the next level for numeric & spatial data in Lucene’s index.
> If you’d like to learn more about RPT and Lucene/Solr spatial, I suggest
> my “Spatial Deep Dive” presentation at Lucene Revolution in San Diego, May
> 2013:  Lucene / Solr 4 Spatial Deep Dive
> Also, my article here illustrates some RPT concepts in terms of indexing:
> ~ David Smiley
> Freelance Apache Lucene/Solr Search Consultant/Developer
> On Sat, Jan 10, 2015 at 10:26 AM, Matteo Tarantino wrote:
>> Hi all,
>> I hope I am not bothering you, but I think I'm writing to the only mailing
>> list that can help me with my question.
>> I am writing my master's thesis about Geographical Information Retrieval
>> (GIR) and I'm using Solr to create a small geospatial search engine.
>> Reading papers about GIR, I noticed that these systems use a separate data
>> structure (like an R-tree) to store the
>> geographical coordinates of documents, but I have found nothing about how
>> Solr manages coordinates.
>> Can someone help me and, most of all, point me to documents
>> that describe how and where Solr stores spatial information?
>> Thank you in advance
>> Matteo
