"Simon Slavin" wrote ... > > On 21 Aug 2010, at 3:29am, Igor Tandetnik wrote: > >> ve3meo <holden_fam...@sympatico.ca> wrote: >>> If the number of VM instructions is a good inverse indicator of >>> efficiency, >>> then the second query with the index appears to be the most efficient. >>> I'm >>> sure somebody will point out if it is not. >> >> It is not. It may very well take fewer instructions to implement a >> straightforward linear scan than it is to join through the index. > > Yeah. I think your formulation like this: > >> 41 - Jim's query revamped to work: >> select * from A as A1, A as A2 where A1.name=A2.name and >> A1.position != A2.position and >> A2.position between A1.position - 10 and A1.position + 10 ; > > is going to be the fastest. It involves more VM instructions but ends up > searching fewer points. You might want to time it as it is, then add an > index (just for testing) on just (position) and see whether that improves > matters.
Thanks for the explanation. I thought I might be naively wishing for too much from the count of VM instructions when some may take much longer to execute than others. I was intrigued by Peng's question and wondered if I might be able to predict efficiency by examining Explain Query Plan and Explain. Apparently not. I understand the principle of minimising the number of points to search and I think that this argues for the WHERE conditions to have the coarse resolution first and the finest resolution last. To pick up on Simon's suggestion to chunk, I added an integer field 'chunk' to the table: UPDATE A SET chunk = (position/10); I indexed a bunch of ways to see what the optimiser would do: create index idxNameChunkPos ON A(name,chunk,position); create index idxNamePos ON A(name,position); create index idxNameChunk ON A(name,chunk); create index idxName ON A(name); create index idxPos ON A(position); create index idxChunk ON A(chunk); Here's how I have interpreted Simon's suggested chunky query: select * from A as A1, A as A2 where A1.name=A2.name and A1.position != A2.position and A2.chunk between A1.chunk - 1 and A1.chunk + 1 and A2.position between A1.position - 10 and A1.position + 10 ; Explain Query Plan shows that this query uses table A2 with idxNameChunkPos. Is this liable to result in a minimum of points searched? But so do Peng's original and re-ordered query use idxNameChunkPos even though Chunk is not in the query. Despite the presence of all these other indexes, Jim's suggested query persists in using idxNamePos, as one would expect it to be the optimal index for that query and for the original and re-ordered. I'm wary of what the optimiser does. On those first two queries, I have seen that it switches from using idxNamePos to idxName after the addition of the Chunk field when only those two indexes were present. I should add that the SQLite manager through which I play with SQLite uses a 3.7.0 DLL. Tom _______________________________________________ sqlite-users mailing list sqlite-users@sqlite.org http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users