[sqlite] index usage and boolean constants

2015-10-28 Thread Marcus Rohrmoser
Hi fellow-sqliters,

please forgive  my newbie question, I hope it wasn't asked similar recently. I 
didn't find a way to properly search the list archives to check.

CREATE TABLE demo (
  id INTEGER PRIMARY KEY
);
EXPLAIN QUERY PLAN SELECT * FROM demo WHERE 0 OR id=3;
EXPLAIN QUERY PLAN SELECT * FROM demo WHERE id=3;

doesn't use the index in case 1, only in the 2nd case.

Is there a way to hint the query analyzer to use the index? What other tricks 
come into your mind?

Cheers,
Marcus

P.S.: my complete use case is this 
https://github.com/mro/librdf.sqlite/issues/11


[sqlite] index usage and boolean constants

2015-10-28 Thread Richard Hipp
On 10/28/15, Marcus Rohrmoser  wrote:
>
> CREATE TABLE demo (
>   id INTEGER PRIMARY KEY
> );
> EXPLAIN QUERY PLAN SELECT * FROM demo WHERE 0 OR id=3;
> EXPLAIN QUERY PLAN SELECT * FROM demo WHERE id=3;
>
> doesn't use the index in case 1, only in the 2nd case.
>
> Is there a way to hint the query analyzer to use the index? What other
> tricks come into your mind?
>

The query planner in SQLite does not (currently) recognize the special
case of "0 OR ...".  Hence, the only way to get it to use an index is
the second form of the query.
-- 
D. Richard Hipp
drh at sqlite.org


Re: [sqlite] Index usage for order by and where clauses

2014-10-29 Thread Keith Medcalf

On Wednesday, 29 October, 2014 07:47, Clemens Ladisch said:
>Baruch Burstein wrote:
>> If I have an index on table1(colA, colB), will it be used for both the
>> where and the order by in either of these cases:

>> select * from table1 where colA=1 order by colB;
>> select * from table1 where colB=1 order by colA;

>$ sqlite3
>sqlite> create table table1(colA, colB, [...]);
>sqlite> create index i on table1(colA, colB);
>sqlite> explain query plan select * from table1 where colA=1 order by
>colB;
>0|0|0|SEARCH TABLE table1 USING INDEX i (colA=?)
>sqlite> explain query plan select * from table1 where colB=1 order by
>colA;
>0|0|0|SCAN TABLE table1 USING INDEX i

>In the second query, the database reads colB1 from the index, but
>this happens for _all_ rows, so the WHERE is not sped up.

That depends on your definition of "sped up", the hit rate of the where clause, 
and the size of the rows in the underlying table.  Since the index contains 
only a couple of bytes of data it can most likely be scanned far faster than 
the underlying table rows, and the result set does not need to be sorted before 
being returned.  For five rows it will make no difference.  For thousands or 
millions of rows the difference will be significant.




___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage for order by and where clauses

2014-10-29 Thread Clemens Ladisch
Baruch Burstein wrote:
> If I have an index on table1(colA, colB), will it be used for both the
> where and the order by in either of these cases:
>
> select * from table1 where colA=1 order by colB;
> select * from table1 where colB=1 order by colA;

$ sqlite3
sqlite> create table table1(colA, colB, [...]);
sqlite> create index i on table1(colA, colB);
sqlite> explain query plan select * from table1 where colA=1 order by colB;
0|0|0|SEARCH TABLE table1 USING INDEX i (colA=?)
sqlite> explain query plan select * from table1 where colB=1 order by colA;
0|0|0|SCAN TABLE table1 USING INDEX i

In the second query, the database reads colB1 from the index, but
this happens for _all_ rows, so the WHERE is not sped up.


Regards,
Clemens
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Index usage for order by and where clauses

2014-10-29 Thread Baruch Burstein
Hi,

If I have an index on table1(colA, colB), will it be used for both the
where and the order by in either of these cases:

select * from table1 where colA=1 order by colB;
select * from table1 where colB=1 order by colA;

-- 
˙uʍop-ǝpısdn sı ɹoʇıuoɯ ɹnoʎ 'sıɥʇ pɐǝɹ uɐɔ noʎ ɟı
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage when using UNION

2011-12-16 Thread Simon Slavin

On 16 Dec 2011, at 2:20am, Igor Tandetnik wrote:

> Simon Slavin  wrote:
>> On 15 Dec 2011, at 7:19pm, Alexandr Němec wrote:
>> 
>>> [UNION]
>> 
>> Your 'ORDER BY' clause applies only to the second SELECT.
> 
> Not true.

Yeah, so I noticed from Richard's post.  Sorry for posting incorrect 
information everyone.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage when using UNION

2011-12-15 Thread Igor Tandetnik
Simon Slavin  wrote:
> On 15 Dec 2011, at 7:19pm, Alexandr Němec wrote:
> 
>> just a quick question, I did not find the answer in the various technical 
>> documents. I have two identical tables with a id
>> INTEGER as a primary key, which means that SELECTions ORDERed BY id are very 
>> fast. Now if I do SELECT * FROM table1 UNION ALL
>> SELECT * FROM table2 ORDER by id, rows from both tables appear in the 
>> resultset. But I was surprised that the speed of the
>> sorting is still the same (as for one indexed column). Does it mean, that 
>> SQLite can (somehow) use the index to speed up the
>> sorting when UNIONing several tables?
> 
> Your 'ORDER BY' clause applies only to the second SELECT.

Not true. ORDER BY applies to the final resultset, after the union. Try adding 
an ORDER BY clause between the first SELECT and UNION ALL - you'll get an error.
-- 
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage when using UNION

2011-12-15 Thread Richard Hipp
2011/12/15 Alexandr Němec 

>
> Dear all,
>
> just a quick question, I did not find the answer in the various technical
> documents. I have two identical tables with a id INTEGER as a primary key,
> which means that SELECTions ORDERed BY id are very fast. Now if I do SELECT
> * FROM table1 UNION ALL SELECT * FROM table2 ORDER by id, rows from both
> tables appear in the resultset. But I was surprised that the speed of the
> sorting is still the same (as for one indexed column). Does it mean, that
> SQLite can (somehow) use the index to speed up the sorting when UNIONing
> several tables?
>

Yes.

The ORDER BY applies to the complete result of the UNION ALL.  So what
SQLite does in this case is run both subqueries in a parallel, delivering
the results of each subquery in sorted order (which is easy since the
source key is the primary key), and merge the results together.



>
> Thanks
> Alex
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>



-- 
D. Richard Hipp
d...@sqlite.org
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage when using UNION

2011-12-15 Thread Simon Slavin

On 15 Dec 2011, at 7:19pm, Alexandr Němec wrote:

> just a quick question, I did not find the answer in the various technical 
> documents. I have two identical tables with a id INTEGER as a primary key, 
> which means that SELECTions ORDERed BY id are very fast. Now if I do SELECT * 
> FROM table1 UNION ALL SELECT * FROM table2 ORDER by id, rows from both tables 
> appear in the resultset. But I was surprised that the speed of the sorting is 
> still the same (as for one indexed column). Does it mean, that SQLite can 
> (somehow) use the index to speed up the sorting when UNIONing several tables?

Your 'ORDER BY' clause applies only to the second SELECT.  The rows from table1 
are not being sorted at all, they're appearing in whatever order SQLite finds 
them.  Which just happens to be 'id' order.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Index usage when using UNION

2011-12-15 Thread Alexandr Němec

Dear all,
 
just a quick question, I did not find the answer in the various technical 
documents. I have two identical tables with a id INTEGER as a primary key, 
which means that SELECTions ORDERed BY id are very fast. Now if I do SELECT * 
FROM table1 UNION ALL SELECT * FROM table2 ORDER by id, rows from both tables 
appear in the resultset. But I was surprised that the speed of the sorting is 
still the same (as for one indexed column). Does it mean, that SQLite can 
(somehow) use the index to speed up the sorting when UNIONing several tables?
 
Thanks
Alex
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage when querying views

2011-09-25 Thread Igor Tandetnik
Nikolaus Rath  wrote:
> However, if I use an intermediate view:
> 
> sqlite>CREATE VIEW inode_blocks_v AS
>   SELECT * FROM inode_blocks
>   UNION
>   SELECT id as inode, 0 as blockno, block_id FROM inodes WHERE block_id 
> IS NOT NULL
> 
> and then run the same query on the view, SQLite scans through all
> involved tables:

A select from the view is transformed into

SELECT 1 FROM 
(SELECT * FROM inode_blocks
 UNION
 SELECT id as inode, 0 as blockno, block_id FROM inodes WHERE block_id IS NOT 
NULL
)
WHERE inode=42;

SQLite's optimizer isn't really that smart - definitely not smart enough to 
move the condition into the sub-select and duplicate it into each subquery. 
That's a rather non-trivial transformation.
-- 
Igor Tandetnik

___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage when querying views

2011-09-25 Thread Simon Slavin

On 25 Sep 2011, at 9:25pm, Nikolaus Rath wrote:

> However, if I use an intermediate view:
> 
> sqlite>CREATE VIEW inode_blocks_v AS
>   SELECT * FROM inode_blocks
>   UNION
>   SELECT id as inode, 0 as blockno, block_id FROM inodes WHERE block_id 
> IS NOT NULL
> 
> and then run the same query on the view, SQLite scans through all
> involved tables

I think it should work too.  Can you create indices on the views ?  Hmm.  No, I 
see SQLite doesn't do that.  I can only recommend you do both SELECTs and do 
the same thing with the results of each.

Simon.
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Index usage when querying views

2011-09-25 Thread Nikolaus Rath
Hello,

I have two tables, inodes and inode_blocks, which have indices on id and
inode respectively.

The following query seems to make optimal use of the indices:

sqlite> explain query plan
SELECT 1 from inode_blocks where inode=42
UNION
SELECT 1 from inodes where block_id is not null and id=42;
1|0|0|SEARCH TABLE inode_blocks USING COVERING INDEX 
sqlite_autoindex_inode_blocks_1 (inode=?) (~5 rows)
2|0|0|SEARCH TABLE inodes USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
0|0|0|COMPOUND SUBQUERIES 1 AND 2 USING TEMP B-TREE (UNION)

However, if I use an intermediate view:

sqlite>CREATE VIEW inode_blocks_v AS
   SELECT * FROM inode_blocks
   UNION
   SELECT id as inode, 0 as blockno, block_id FROM inodes WHERE block_id IS 
NOT NULL

and then run the same query on the view, SQLite scans through all
involved tables:

sqlite> explain query plan SELECT 1 FROM inode_blocks_v WHERE inode=42;
2|0|0|SCAN TABLE inode_blocks (~10711 rows)
3|0|0|SCAN TABLE inodes (~131030 rows)
1|0|0|COMPOUND SUBQUERIES 2 AND 3 USING TEMP B-TREE (UNION)
0|0|0|SCAN SUBQUERY 1 (~14174 rows)

Is there any way I can make SQLite use the indices here as well? I can't
see any reason of why they couldn't be used.


Best,

   -Nikolaus

-- 
 »Time flies like an arrow, fruit flies like a Banana.«

  PGP fingerprint: 5B93 61F8 4EA2 E279 ABF6  02CF A9AD B7F8 AE4E 425C
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage

2009-09-21 Thread Griggs, Donald
Matthew,

Regarding: "There's no way to optimize your query to be fast in both
situations."

I do *not* know if this would be of any help, but the newest 3.1.18
sqlite release which includes the SQLITE_ENABLE_STAT2 feature may
possibly be of interest:  (and excuse me if you've mentioned this
already)

http://sqlite.org/compile.html#enable_stat2
SQLITE_ENABLE_STAT2

This option adds additional logic to the ANALYZE command and to the
query planner that can help SQLite to chose a better query plan under
certain situations. The ANALYZE command is enhanced to collect a
10-sample histogram of the data in each index and store that histogram
in the sqlite_stat2 table. The query planner will then use the histogram
data to help it estimate how many rows will be selected by a range
constraint in a WHERE clause.  

Regards,
Donald Griggs
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage

2009-09-21 Thread Matthew L. Creech
On Mon, Sep 21, 2009 at 8:27 AM, Pavel Ivanov  wrote:
>
> There's no way to optimize your query to be fast in both situations.
> LIMIT clause is pretty hard to optimize. Maybe just to have a closer
> look at the application structure - maybe it's not so necessary to do
> ORDER BY or maybe LIMIT can be moved to inner query...
> But for this particular case I think it's pretty reasonable to use
> INDEXED BY clause despite what documentation says (it discourages
> usage for common cases).
>

Yeah, that's what I was afraid of.  :)  I guess I'll end up just
tracking the number of val_table entries which match each path, then
totaling up the # of matching entries first to get a count of how many
rows my real query is going to match.  Using that, and the LIMIT BY
items, I can maybe heuristically guess which indexing method will be
faster for when I do the real query.  Seems like a pain for this
relatively simple scenario, but I can see how it'd be deceptively
easy-looking to optimize.

Thanks for the response

-- 
Matthew L. Creech
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re: [sqlite] Index usage

2009-09-21 Thread Pavel Ivanov
> My question: how can I optimize this kind of query so that it utilizes
> both indexes, to grab the first [b] rows (ordered by time) which also
> match [a]?  Or am I just going to have to guess at which way will be
> faster, and use "INDEXED BY" to force it?  (The documentation says I
> shouldn't have to do this)

There's no way to optimize your query to be fast in both situations.
LIMIT clause is pretty hard to optimize. Maybe just to have a closer
look at the application structure - maybe it's not so necessary to do
ORDER BY or maybe LIMIT can be moved to inner query...
But for this particular case I think it's pretty reasonable to use
INDEXED BY clause despite what documentation says (it discourages
usage for common cases).

Pavel

On Fri, Sep 18, 2009 at 5:55 PM, Matthew L. Creech  wrote:
> Hi,
>
> I'm trying to optimize a query for 2 different scenarios, and I'm
> having trouble getting something that works good in general.  I want
> to be sure I'm not missing something.  Here are the tables and indexes
> used in my database:
>
> sqlite> CREATE TABLE path_table (idx INTEGER PRIMARY KEY, path TEXT UNIQUE);
> sqlite> CREATE TABLE val_table (idx INTEGER, val REAL, time INTEGER);
>
> sqlite> CREATE INDEX time_idx ON val_table (time ASC);
> sqlite> CREATE INDEX path_idx ON val_table (idx ASC);
>
> 'path_table' contains unique string path names, while 'val_table'
> records any number of values associated with each path, and the time
> at which the value occurred.
>
> My query looks something like:
>
> sqlite> SELECT val FROM val_table WHERE idx IN (SELECT idx FROM
> path_table WHERE path GLOB '[a]') ORDER BY time ASC LIMIT [b];
>
> where [a] and [b] are provided by my code's caller.  My problem occurs
> when the idx value(s) selected from 'path_table' match a large number
> of records in the database - say, 100,000 out of 1,000,000 records.
> In that case, the query takes several minutes to complete even when
> [b] is small.  Presumably it's first looking up all 100,000 rows where
> 'idx' matches, then applying the ORDER BY clause to those results
> without indexing.  EXPLAIN QUERY PLAN confirms:
>
> 0|0|TABLE val_table WITH INDEX path_idx
>
> I tried adding "INDEXED BY time_idx", which greatly improved this
> particular case, because statistically 1/10 rows will match 'idx' and
> therefore we find [b] of them very quickly when [b] is small.  But
> this hurts performance in other cases, since if there are only a few
> rows with a matching 'idx', the query ends up manually walking through
> most of the table.
>
> My question: how can I optimize this kind of query so that it utilizes
> both indexes, to grab the first [b] rows (ordered by time) which also
> match [a]?  Or am I just going to have to guess at which way will be
> faster, and use "INDEXED BY" to force it?  (The documentation says I
> shouldn't have to do this)
>
> Thanks for the help!
>
> --
> Matthew L. Creech
> ___
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
>
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


[sqlite] Index usage

2009-09-18 Thread Matthew L. Creech
Hi,

I'm trying to optimize a query for 2 different scenarios, and I'm
having trouble getting something that works good in general.  I want
to be sure I'm not missing something.  Here are the tables and indexes
used in my database:

sqlite> CREATE TABLE path_table (idx INTEGER PRIMARY KEY, path TEXT UNIQUE);
sqlite> CREATE TABLE val_table (idx INTEGER, val REAL, time INTEGER);

sqlite> CREATE INDEX time_idx ON val_table (time ASC);
sqlite> CREATE INDEX path_idx ON val_table (idx ASC);

'path_table' contains unique string path names, while 'val_table'
records any number of values associated with each path, and the time
at which the value occurred.

My query looks something like:

sqlite> SELECT val FROM val_table WHERE idx IN (SELECT idx FROM
path_table WHERE path GLOB '[a]') ORDER BY time ASC LIMIT [b];

where [a] and [b] are provided by my code's caller.  My problem occurs
when the idx value(s) selected from 'path_table' match a large number
of records in the database - say, 100,000 out of 1,000,000 records.
In that case, the query takes several minutes to complete even when
[b] is small.  Presumably it's first looking up all 100,000 rows where
'idx' matches, then applying the ORDER BY clause to those results
without indexing.  EXPLAIN QUERY PLAN confirms:

0|0|TABLE val_table WITH INDEX path_idx

I tried adding "INDEXED BY time_idx", which greatly improved this
particular case, because statistically 1/10 rows will match 'idx' and
therefore we find [b] of them very quickly when [b] is small.  But
this hurts performance in other cases, since if there are only a few
rows with a matching 'idx', the query ends up manually walking through
most of the table.

My question: how can I optimize this kind of query so that it utilizes
both indexes, to grab the first [b] rows (ordered by time) which also
match [a]?  Or am I just going to have to guess at which way will be
faster, and use "INDEXED BY" to force it?  (The documentation says I
shouldn't have to do this)

Thanks for the help!

-- 
Matthew L. Creech
___
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users


Re[2]: [sqlite] Index usage with LIKE queries

2007-09-12 Thread ser-ega
dc> If you are building an email indexing system, you problem
dc> want to use Full Text Search with fts3, not the LIKE
dc> operator. See
dc>http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

Thanks, interesting, but seems it's not my case, cause I'm doing search by 
partial name/email matching for autocompleting text field, while user is typing 
the value.

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Index usage with LIKE queries

2007-09-12 Thread drh
<[EMAIL PROTECTED]> wrote:
> I thought I can create two separate indexes: on name and on email,
> and when I execute a query with "name LIKE 'value' OR email 
> LIKE 'value'" both indexes would be used.
> 

If you are building an email indexing system, you problem
want to use Full Text Search with fts3, not the LIKE
operator. See

   http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex

--
D. Richard Hipp <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Index usage with LIKE queries

2007-09-12 Thread drh
<[EMAIL PROTECTED]> wrote:
> IT> LIKE is case-insensitive by default. To have it use your index, you need 
> IT> to either make the index case-insensitive:
> IT> 
> IT> CREATE INDEX test_name ON test (name COLLATE NOCASE);
> 
> Sorry, tried to create the index this way, but it 
> still isn't used by the query.
> 

http://www.sqlite.org/optoverview.html#like_opt


--
D. Richard Hipp <[EMAIL PROTECTED]>


-
To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] Index usage with LIKE queries

2007-09-12 Thread ser-ega
Hi

I'm trying to get index used with LIKE queries:
CREATE TABLE test (name STRING);
CREATE INDEX test_name ON test (name);
EXPLAIN QUERY PLAN SELECT * FROM test WHERE name = 'aaa';
The output is:
0|0|TABLE test WITH INDEX test_name
i.e. when LIKE is not used, the index is involved, everything is fine.
Then I'm trying LIKE query:
EXPLAIN QUERY PLAN SELECT * FROM test WHERE name LIKE 'aaa';
The output is:
0|0|TABLE test
i.e. the index is not used. Why? What am I doing wrong?

Thanks,
telega

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] Index usage for inequalities and GROUP BY

2007-03-27 Thread Brownie

I have a table and indices as follows;

CREATE TABLE employee(name TEXT,salary INTEGER,job TEXT);
CREATE INDEX idx_emp_salary_job ON employee(salary,job);
CREATE INDEX idx_emp_job ON employee(job);

When I use the following query, SQLite seems to use idx_emp_salary_job
for both WHERE and GROUP BY.

SELECT job, COUNT(name) FROM employee WHERE salary=100 GROUP BY job;
0|0|TABLE employee WITH INDEX emp_salary_job ORDER BY

But when modify this query as follows, SQLite seems to use
idx_emp_salary_job for WHERE only. Not used for GROUP BY.

SELECT job, COUNT(name) FROM employee WHERE salary>=100 GROUP BY job;
0|0|TABLE employee WITH INDEX emp_salary_job

So, I modify it by using WHERE EXISTS and subqueries.

SELECT job, COUNT(name) FROM employee WHERE EXISTS
(SELECT * FROM employee WHERE salary>=100) GROUP BY job;
0|0|TABLE employee WITH INDEX emp_job ORDER BY
0|0|TABLE employee WITH INDEX emp_salary_job

It seems to use indices for both WHERE EXISTS and GROUP BY.

Is there more efficient way for the query that have inequalities and GROUP BY?

Regards,

-
To unsubscribe, send email to [EMAIL PROTECTED]
-



Re: [sqlite] Index usage

2006-06-20 Thread Paul Smith

At 14:25 20/06/2006, Mikey C wrote:


Hi,

I just wanted to ask for confirmation that my understanding on how the query
optimiser works is correct.

SQLite only uses one index for each table in a FROM?


Yes


What if tables are joined?  Does an index get used for each joined table?


No, just one index for the query. (It tries to pick the best one)


PaulVPOP3 - Internet Email Server/Gateway
[EMAIL PROTECTED]  http://www.pscs.co.uk/




[sqlite] Index usage

2006-06-20 Thread Mikey C

Hi,

I just wanted to ask for confirmation that my understanding on how the query
optimiser works is correct.

SQLite only uses one index for each table in a FROM?

What if tables are joined?  Does an index get used for each joined table?


So if I have

SELECT * 
FROM A
INNER JOIN B
ON A.COL1 = B.COL1
INNER JOIN C
ON C.COL1 = B.COL2

The query optimiser can use 3 indexes max?  One on A, B and C

Since B is joined on COL1 and COL2, only one join can use an index?

If a WHERE is added:

SELECT * 
FROM A
INNER JOIN B
ON A.COL1 = B.COL1
INNER JOIN C
ON C.COL1 = B.COL2
WHERE A.COL2 = 'fred'

Then again only one index can be used on table A?  Either the join or the
where?

Is my understanding correct?
--
View this message in context: 
http://www.nabble.com/Index-usage-t1817658.html#a4955210
Sent from the SQLite forum at Nabble.com.



Re: [sqlite] Index usage tracking

2006-06-08 Thread Dennis Jenkins
[EMAIL PROTECTED] wrote:
> Dennis Jenkins <[EMAIL PROTECTED]> wrote:
>   
>> I would like to know where the best place in sqlite is to patch to
>> have it record (syslog for unix, OutputDebugString() for windows,
>> nothing fancy) each time it decides to use an index to satisfy a query. 
>> 
>
> The index decisions are all made in where.c and there is already
> code in that file for printing out the decisions for testing
> purposes.  I suggest you search for SQLITE_TEST inside where.c,
> see that current testing code, and modify that to do whatever
> it is you want to do.
>
> --
> D. Richard Hipp   <[EMAIL PROTECTED]>
>
>   
Thank you very much.  I see the code that I need to tweak now.  ;)



Re: [sqlite] Index usage tracking

2006-06-08 Thread drh
Dennis Jenkins <[EMAIL PROTECTED]> wrote:
> 
> I would like to know where the best place in sqlite is to patch to
> have it record (syslog for unix, OutputDebugString() for windows,
> nothing fancy) each time it decides to use an index to satisfy a query. 

The index decisions are all made in where.c and there is already
code in that file for printing out the decisions for testing
purposes.  I suggest you search for SQLITE_TEST inside where.c,
see that current testing code, and modify that to do whatever
it is you want to do.

--
D. Richard Hipp   <[EMAIL PROTECTED]>



[sqlite] Index usage tracking

2006-06-08 Thread Dennis Jenkins
Hello all,

I would like to know where the best place in sqlite is to patch to
have it record (syslog for unix, OutputDebugString() for windows,
nothing fancy) each time it decides to use an index to satisfy a query. 
For that matter, also each time is runs a select query and decided to
not use an index.  This is only for in-house debugging.  Consider it
"coverage testing" of all of our SQL to determine if I have the correct
indicies.

I'd like the logging to record/emit the original SQL and the names
of the indicies (if any) used to execute that SQL.

Our app has grown and morphed over the past two years.  It has LOTS
of sql in it now.  Granted, I could isolate all of this sql (even the
dynamically generated stuff) (select, delete, update statements) and run
it through the analyzer.  However, if the above mentioned "hack" is easy
to do, then I would prefer the hack.  I'm trying to determine is all of
my indicies are actually being used, and to what frequency they are
being used during a typical run of our software.

I've been reading through "select.c" and "vdbe.c".  I'm not sure if
I should add the hack to the Virtual Machine opcode emitting code or the
opcode consuming code.  Maybe there is already a solution to my problem
and I simply didn't see it.



Re: [sqlite] Index Usage

2004-10-28 Thread D. Richard Hipp
William Hachfeld wrote:
Am I also correct in understanding that if I did:
CREATE INDEX MultiColumnIndex ON Example (begin, end, grp);
SELECT id FROM Example WHERE x < end AND y >= begin AND grp=g;
That I would only make use of 1 of the 3 terms in the index?
Correct.  Specifically the y>=begin term would be used.
The VDBE opcodes for my sub-select query looked almost identical to the non-
sub-select version. So I'm assuming that internally SQLite folds these together
and treats them, in effect, like a single query rather than a two-part query.
Yes.  Subqueries are folded into the main query where possible.  This
optimization is necessary to implement views efficiently.  A view is
really just an alias for a subquery.
At the risk of trying everyone's patience, I have one more question... Can any
generalizations be made about the relative performance of the following queries
(again using the same example table):
CREATE INDEX IndexA ON Example (grp, begin);
SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;
versus:
CREATE INDEX IndexA ON Example (grp);
CREATE INDEX IndexB ON Example (begin);
SELECT * FROM Example WHERE grp=g
INTERSECT SELECT * FROM Example WHERE x < end AND y >= begin;
or maybe even:
CREATE INDEX IndexA ON Example (grp, begin);
CREATE INDEX IndexB ON Example (end);

SELECT * FROM Example WHERE grp=g AND y >= begin
	INTERSECT SELECT * FROM Example WHERE x < end;

given a large (~1,000,000 rows) table? Is the cost of creating the temporary
table for the compound SELECT usually going to outweigh the benefit of using a
second index?
INTERSECT creates two temporary tables, not one.  I'm guessing the
first query would be faster.  But that is only a guess.  Try it and
see what you get.

--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565


Re: [sqlite] Index Usage

2004-10-28 Thread William Hachfeld

Thanks for the information Richard. Your explanation, along with the "Virtual
Database Engine" document that I was reading when you wrote this, makes things
a lot more clear. After reading Ulrik's suggestions, I decided to poke around a
little bit using "EXPLAIN" to see if I could discover what SQLite would do for
my two purposed queries... 

I was able to see in the VDBE opcodes for my first query exactly what you are
telling me - that SQLite will use the "grp" and "begin" terms only. So am I
also correct in understanding that if I did:

CREATE INDEX MultiColumnIndex ON Example (begin, end, grp);
SELECT id FROM Example WHERE x < end AND y >= begin AND grp=g;

That I would only make use of 1 of the 3 terms in the index?

The VDBE opcodes for my sub-select query looked almost identical to the non-
sub-select version. So I'm assuming that internally SQLite folds these together
and treats them, in effect, like a single query rather than a two-part query.

At the risk of trying everyone's patience, I have one more question... Can any
generalizations be made about the relative performance of the following queries
(again using the same example table):

CREATE INDEX IndexA ON Example (grp, begin);
SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;

versus:

CREATE INDEX IndexA ON Example (grp);
CREATE INDEX IndexB ON Example (begin);

SELECT * FROM Example WHERE grp=g
INTERSECT SELECT * FROM Example WHERE x < end AND y >= begin;

or maybe even:

CREATE INDEX IndexA ON Example (grp, begin);
CREATE INDEX IndexB ON Example (end);

SELECT * FROM Example WHERE grp=g AND y >= begin
INTERSECT SELECT * FROM Example WHERE x < end;

given a large (~1,000,000 rows) table? Is the cost of creating the temporary
table for the compound SELECT usually going to outweigh the benefit of using a
second index? Is there any way to force the temporary table to be placed in
main memory rather than on disk?

-- 
William Hachfeld  ([EMAIL PROTECTED], 651-683-3103)
SGI Debugger, Object, and Performance Tools


Re: [sqlite] Index Usage

2004-10-28 Thread D. Richard Hipp
William Hachfeld wrote:
Hi,
Have a question for everyone regarding index usage in SQLite... Say that I have
the following database schema:
CREATE TABLE Example (
id INTEGER PRIMARY KEY,
grp INTEGER,
begin INTEGER,
end INTEGER
);
and I want to perform the following query:
SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;
on a large number of rows (say around one million) for some group 'g' and an
interval '[x, y)'. And, of course, with the assumption that (end > begin) for
all rows. Will my query performance be substantially improved by creating an
index such as:
CREATE INDEX MultiColumnIndex ON Example (grp, begin, end)
or will the operators "<" and ">=" prohibit SQLite from using the index?
SQLite uses inequalities in WHERE clause terms, but only for the
right-most used term of an index.  So in the case above, SQLite
will use the grp and begin columns of the index and ignore the end
column.  So the index
  CREATE INDEX multi ON Example(grp,begin)
would work just as well as the one that includes the third "end" column.
Also, I'm aware that SQLite supports multi-column indicies, but not the use of
multiple indicies per query. Is it possible to get around the later restriction
by expressing my above query using a sub-select:
SELECT id FROM (SELECT * FROM Example WHERE grp=g)
WHERE x < end AND y >= begin;
and then creating the following indicies instead:
CREATE INDEX GroupIndex ON Example (group)
CREATE INDEX IntervalIndex ON Example (begin, end)
And if so, can any generalizations be made regarding the performance of using
the two indicies versus the first, single, index? How about disk usage?
This won't help any.
Interval conditions are recognized as long as the same column
is used in both terms.  For example, this would help:
  ... WHERE y>=begin AND y= and the < terms,
SQLite has to choose one or the other, it cannot use both.
--
D. Richard Hipp -- [EMAIL PROTECTED] -- 704.948.4565


Re: [sqlite] Index Usage

2004-10-28 Thread William Hachfeld

Thanks for the advice Ulrik! 

I don't believe, however, that the alternate query you purposed using BETWEEN
is quite equivalent to what I was going to do. I am storing intervals [begin,
end) in the database and then looking for those intervals from the database
that intersect [x, y) - not those intervals contained by [x, y). A subtle, but
important, distinction in my application.

In any case, your information about SQLite's use (or lack there-of) of a
(begin, end) index is certainly helpful to me. Clearly I need to use the
EXPLAIN command and see if I can decipher what SQLite does under various
permutations of the query.

-- 
William Hachfeld  ([EMAIL PROTECTED], 651-683-3103)
SGI Debugger, Object, and Performance Tools


Re: [sqlite] Index Usage

2004-10-28 Thread Ulrik Petersen
Christian Jensen wrote:
I noticed that you use { instead of (
What do those do?
 

Sorry.  They were meant as pseudo-syntax so that he could insert 
whatever his own value was.  I did that because I didn't want him to write

BETWEEN 10 AND 15-1
but rather calculate the 15-1 inside his program, and then do
BETWEEN 10 AND 14
I guess I should have made that clear.
Cheers,
Ulrik


-Original Message-
From: Ulrik Petersen [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 28, 2004 10:28 AM
To: [EMAIL PROTECTED]
Subject: Re: [sqlite] Index Usage

William,
William Hachfeld wrote:
 

Hi,
Have a question for everyone regarding index usage in SQLite... Say 
that I have the following database schema:

  CREATE TABLE Example (
id INTEGER PRIMARY KEY,
grp INTEGER,
begin INTEGER,
end INTEGER
  );
and I want to perform the following query:
  SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;
on a large number of rows (say around one million) for some group 'g' 
and an interval '[x, y)'. And, of course, with the assumption that (end
   

 

begin) for all rows. Will my query performance be substantially 
 

improved by creating an index such as:
  CREATE INDEX MultiColumnIndex ON Example (grp, begin, end)
or will the operators "<" and ">=" prohibit SQLite from using the
   

index?
 


   

I have almost the same table in my linguistic database, Emdros
(http://emdros.org).  What I have found that works best for me is to put
an index on what you call "begin" (not a double index), then do
SELECT id
FROM Example
WHERE grp = g
   AND begin BETWEEN {x} AND {y-1}
   AND end BETWEEN {x} AND {y-1}.
For some strange reason, this is about 5% faster than what you were
proposing.  It could be because SQLite does not know that begin <= end,
and so can't make optimizations about when to stop looking.
 

Also, I'm aware that SQLite supports multi-column indicies, but not the
   

 

use of multiple indicies per query. Is it possible to get around the 
later restriction by expressing my above query using a sub-select:

  SELECT id FROM (SELECT * FROM Example WHERE grp=g)
WHERE x < end AND y >= begin;
and then creating the following indicies instead:
  CREATE INDEX GroupIndex ON Example (group)
  CREATE INDEX IntervalIndex ON Example (begin, end)
And if so, can any generalizations be made regarding the performance of
   

 

using the two indicies versus the first, single, index? How about disk
   

usage?
 


   

I cannot comment on this, except that I've run EXPLAIN on my versions of
the above queries, and found that SQLite wouldn't consult the "end" part
of the (begin,end) index.  Instead, it would consult the "end" part of
the table column, and then only use the "begin" part of the index.  At
least that's how I understood the EXPLAIN output, but I may be wrong.
The upshot of the above is that you can save diskspace by not doing the
double index, and only indexing "begin", since for these queries, the
"end" part is redundant (i.e., not used) in the index.
Cheers,
Ulrik
--
Ulrik Petersen, MA, B.Sc.
Emdros -- the text database engine for analyzed or annotated text
http://emdros.org/


 


--
Ulrik Petersen, MA, B.Sc.



RE: [sqlite] Index Usage

2004-10-28 Thread Christian Jensen
I noticed that you use { instead of (

What do those do?
 

-Original Message-
From: Ulrik Petersen [mailto:[EMAIL PROTECTED] 
Sent: Thursday, October 28, 2004 10:28 AM
To: [EMAIL PROTECTED]
Subject: Re: [sqlite] Index Usage

William,

William Hachfeld wrote:

>Hi,
>
>Have a question for everyone regarding index usage in SQLite... Say 
>that I have the following database schema:
>
>CREATE TABLE Example (
>   id INTEGER PRIMARY KEY,
>   grp INTEGER,
>   begin INTEGER,
>   end INTEGER
>);
>
>and I want to perform the following query:
>
>SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;
>
>on a large number of rows (say around one million) for some group 'g' 
>and an interval '[x, y)'. And, of course, with the assumption that (end

>> begin) for all rows. Will my query performance be substantially 
>improved by creating an index such as:
>
>CREATE INDEX MultiColumnIndex ON Example (grp, begin, end)
>
>or will the operators "<" and ">=" prohibit SQLite from using the
index?
>  
>
I have almost the same table in my linguistic database, Emdros
(http://emdros.org).  What I have found that works best for me is to put
an index on what you call "begin" (not a double index), then do

SELECT id
FROM Example
WHERE grp = g
AND begin BETWEEN {x} AND {y-1}
AND end BETWEEN {x} AND {y-1}.

For some strange reason, this is about 5% faster than what you were
proposing.  It could be because SQLite does not know that begin <= end,
and so can't make optimizations about when to stop looking.


>Also, I'm aware that SQLite supports multi-column indicies, but not the

>use of multiple indicies per query. Is it possible to get around the 
>later restriction by expressing my above query using a sub-select:
>
>SELECT id FROM (SELECT * FROM Example WHERE grp=g)
>   WHERE x < end AND y >= begin;
>
>and then creating the following indicies instead:
>
>CREATE INDEX GroupIndex ON Example (group)
>CREATE INDEX IntervalIndex ON Example (begin, end)
>
>And if so, can any generalizations be made regarding the performance of

>using the two indicies versus the first, single, index? How about disk
usage?
>  
>
I cannot comment on this, except that I've run EXPLAIN on my versions of
the above queries, and found that SQLite wouldn't consult the "end" part
of the (begin,end) index.  Instead, it would consult the "end" part of
the table column, and then only use the "begin" part of the index.  At
least that's how I understood the EXPLAIN output, but I may be wrong.

The upshot of the above is that you can save diskspace by not doing the
double index, and only indexing "begin", since for these queries, the
"end" part is redundant (i.e., not used) in the index.

Cheers,

Ulrik

--
Ulrik Petersen, MA, B.Sc.
Emdros -- the text database engine for analyzed or annotated text
http://emdros.org/






Re: [sqlite] Index Usage

2004-10-28 Thread Ulrik Petersen
William,
William Hachfeld wrote:
Hi,
Have a question for everyone regarding index usage in SQLite... Say that I have
the following database schema:
   CREATE TABLE Example (
id INTEGER PRIMARY KEY,
grp INTEGER,
begin INTEGER,
end INTEGER
   );
and I want to perform the following query:
   SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;
on a large number of rows (say around one million) for some group 'g' and an
interval '[x, y)'. And, of course, with the assumption that (end > begin) for
all rows. Will my query performance be substantially improved by creating an
index such as:
   CREATE INDEX MultiColumnIndex ON Example (grp, begin, end)
or will the operators "<" and ">=" prohibit SQLite from using the index?
 

I have almost the same table in my linguistic database, Emdros 
(http://emdros.org).  What I have found that works best for me is to put 
an index on what you call "begin" (not a double index), then do

SELECT id
FROM Example
WHERE grp = g
   AND begin BETWEEN {x} AND {y-1}
   AND end BETWEEN {x} AND {y-1}.
For some strange reason, this is about 5% faster than what you were 
proposing.  It could be because SQLite does not know that begin <= end, 
and so can't make optimizations about when to stop looking.


Also, I'm aware that SQLite supports multi-column indicies, but not the use of
multiple indicies per query. Is it possible to get around the later restriction
by expressing my above query using a sub-select:
   SELECT id FROM (SELECT * FROM Example WHERE grp=g)
WHERE x < end AND y >= begin;
and then creating the following indicies instead:
   CREATE INDEX GroupIndex ON Example (group)
   CREATE INDEX IntervalIndex ON Example (begin, end)
And if so, can any generalizations be made regarding the performance of using
the two indicies versus the first, single, index? How about disk usage?
 

I cannot comment on this, except that I've run EXPLAIN on my versions of 
the above queries, and found that SQLite wouldn't consult the "end" part 
of the (begin,end) index.  Instead, it would consult the "end" part of 
the table column, and then only use the "begin" part of the index.  At 
least that's how I understood the EXPLAIN output, but I may be wrong.

The upshot of the above is that you can save diskspace by not doing the 
double index, and only indexing "begin", since for these queries, the 
"end" part is redundant (i.e., not used) in the index.

Cheers,
Ulrik
--
Ulrik Petersen, MA, B.Sc.
Emdros -- the text database engine for analyzed or annotated text
http://emdros.org/



RE: [sqlite] Index Usage

2004-10-27 Thread Christian Jensen
Great Question! I am eager to hear the response! I use a ton of JOINs
and INTERSECTs. Coverage on that topic would be great too! 

-Original Message-
From: William Hachfeld [mailto:[EMAIL PROTECTED] 
Sent: Wednesday, October 27, 2004 9:34 AM
To: [EMAIL PROTECTED]
Subject: [sqlite] Index Usage


Hi,

Have a question for everyone regarding index usage in SQLite... Say that
I have the following database schema:

CREATE TABLE Example (
id INTEGER PRIMARY KEY,
grp INTEGER,
begin INTEGER,
end INTEGER
);

and I want to perform the following query:

SELECT id FROM Example WHERE grp=g AND x < end AND y >= begin;

on a large number of rows (say around one million) for some group 'g'
and an interval '[x, y)'. And, of course, with the assumption that (end
> begin) for all rows. Will my query performance be substantially
improved by creating an index such as:

CREATE INDEX MultiColumnIndex ON Example (grp, begin, end)

or will the operators "<" and ">=" prohibit SQLite from using the index?

Also, I'm aware that SQLite supports multi-column indicies, but not the
use of multiple indicies per query. Is it possible to get around the
later restriction by expressing my above query using a sub-select:

SELECT id FROM (SELECT * FROM Example WHERE grp=g)
WHERE x < end AND y >= begin;

and then creating the following indicies instead:

CREATE INDEX GroupIndex ON Example (group)
CREATE INDEX IntervalIndex ON Example (begin, end)

And if so, can any generalizations be made regarding the performance of
using the two indicies versus the first, single, index? How about disk
usage?

Thanks in advance for any information regarding the above!

-- William Hachfeld