The "Usability" of an index for a certain set of constraints ends with the 
first (in the order mentioned in the index) field with an inequality constraint.

However, and index on (gx,gy) is still be faster for the query shown than just 
on (gx), because the value of the second field is read from the index instead 
of having to retrieve it from the row.

If your application selects only a small subset of fields for the majority of 
queries, it may even be useful to include these field in the index. This is 
called a "covering index". It costs more to maintain, but is faster for queries 
that read only the fields present in (covered by) the index.

-----Urspr?ngliche Nachricht-----
Von: sqlite-users-bounces at mailinglists.sqlite.org 
[mailto:sqlite-users-bounces at mailinglists.sqlite.org] Im Auftrag von 
Wolfgang Enzinger
Gesendet: Mittwoch, 20. April 2016 15:16
An: sqlite-users at mailinglists.sqlite.org
Betreff: [sqlite] storing coordinate pairs efficiently?

Dear group,

since a couple of years now, I'm using SQLite for storing graphic objects like 
polygons and lines, with great success. Using the rtree extension makes spatial 
queries very efficient.

In my lastest project, however, these objects are just single points 
(coordinate pairs). As I understand it, rtree isn't the appropriate tool here 
because it's for range queries. Of course, I could store every coordinate twice 
in an rtee index, thus simulating ranges (with a width of zero). However, I 
guess that would be a waste of space with probably no benefit.

So I tried a few things. The table definition looks like this:

CREATE TABLE "pdata"(
  "gx" DOUBLE NOT NULL,     -- x coordinate
  "gy" DOUBLE NOT NULL,     -- y coordinate
  "ID" INTEGER PRIMARY KEY NOT NULL
  /* a couple more fields here */
)

Then I established an index over gx and gy:

CREATE INDEX "pdata_spatial_index" ON "pdata"("gx","gy");

EQP shows that this index is used (not surprisingly) for every spatial query
like:

SELECT * FROM pdata WHERE gx BETWEEN 210000 AND 220000 AND gy BETWEEN 200000 
AND 390000

However, EQP also shows that it is only used with the value of gx, not gy:

SEARCH TABLE pdata USING INDEX pdata_spatial_index (gx>? AND gx<?)

Since storing "gy" in this index again seems just a waste of space, I left it 
off. OTOH, it doesn't feel right to have no index at all on "gy", so this 
fields got its own index:

CREATE INDEX "pdata_spatial_index_x" ON "pdata"("gx"); CREATE INDEX 
"pdata_spatial_index_y" ON "pdata"("gy");

With ANALYZE data available, SQLite is now smart enough to choose the most 
efficient search path, depending on the width / height of the search range:

SELECT * FROM pdata WHERE gx BETWEEN 210000 AND 220000 and gy BETWEEN 200000 
AND 390000
--> SEARCH TABLE pdata USING INDEX pdata_spatial_index_x (gx>? AND gx<?)

SELECT * FROM pdata WHERE gx BETWEEN 200000 AND 300000 and gy BETWEEN 210000 
AND 220000
--> SEARCH TABLE pdata USING INDEX pdata_spatial_index_y (gy>? AND gy<?)

So far this seems the most efficient solution to me. Any input on this?

Wolfgang

_______________________________________________
sqlite-users mailing list
sqlite-users at mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users


___________________________________________
 Gunter Hick
Software Engineer
Scientific Games International GmbH
FN 157284 a, HG Wien
Klitschgasse 2-4, A-1130 Vienna, Austria
Tel: +43 1 80100 0
E-Mail: hick at scigames.at

This communication (including any attachments) is intended for the use of the 
intended recipient(s) only and may contain information that is confidential, 
privileged or legally protected. Any unauthorized use or dissemination of this 
communication is strictly prohibited. If you have received this communication 
in error, please immediately notify the sender by return e-mail message and 
delete all copies of the original communication. Thank you for your cooperation.


Reply via email to