Hi,

*******************************************************************************************************************
A) What version of SQLite are you using ?

- We are using SQLite version 3.6.17.
- We are using SQLite through the C++ SOCI library, version 3.1.0.
*******************************************************************************************************************
C) How many rows are there in that table ?

In my first email I posted a simplified version of the problem. To avoid any possible confusion it is probably better to post results from the real version. IMO the problem is the same, but I could be wrong. So here is the real database table:

CREATE TABLE "event" (
    "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    "type" INTEGER NOT NULL,
    "current" INTEGER NOT NULL DEFAULT 1,
    "timestamp" INTEGER NOT NULL,
    "file_id" INTEGER,
    "folder_id" INTEGER,
    "deleted" INTEGER NOT NULL,
    "previous_name" TEXT,
    "previous_folder_id" INTEGER,
    "rsub_id" INTEGER,
    "person_id" INTEGER,
    "guest_id" INTEGER,
    transparent INTEGER NOT NULL DEFAULT 0,
    major INTEGER NOT NULL DEFAULT 0
);

CREATE INDEX IDX_event_1 ON event (deleted, major, id);

CREATE TRIGGER update_event AFTER UPDATE ON event FOR EACH ROW
WHEN (old.type != new.type)
BEGIN
    UPDATE event SET major = new.type IN (3, 4, 5, 6, 7, 8, 9) WHERE id = new.id;
END;

CREATE TRIGGER insert_event AFTER INSERT ON event
BEGIN
    UPDATE event SET major = new.type IN (3, 4, 5, 6, 7, 8, 9) WHERE id = new.id;
END;

CREATE INDEX IDX_event_2 ON event (file_id, deleted, id);

=========================================================

select max(id) from event;                                    -- 366899
select count(id) from event where deleted = 0 and major = 0;  -- 326564
select count(id) from event where deleted = 0 and major = 1;  -- 34899

*******************************************************************************************************************
B) What is the result of EXPLAIN SELECT ?

I'm posting the results of EXPLAIN QUERY PLAN and EXPLAIN from the real table, again to avoid any confusion.
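As an aside, the trigger maintenance of the major flag above can be reproduced in miniature. This is only a sketch using Python's built-in sqlite3 module rather than our real C++/SOCI setup, with the columns trimmed down to the ones that matter here:

```python
# Reduced version of the event schema above; the insert trigger derives
# major from type, exactly as in the real table.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    id      INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    type    INTEGER NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0,
    major   INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX IDX_event_1 ON event (deleted, major, id);
CREATE TRIGGER insert_event AFTER INSERT ON event BEGIN
    UPDATE event SET major = NEW.type IN (3, 4, 5, 6, 7, 8, 9) WHERE id = NEW.id;
END;
""")
conn.executemany("INSERT INTO event (type) VALUES (?)", [(1,), (5,), (9,), (2,)])
rows = conn.execute("SELECT id, type, major FROM event ORDER BY id").fetchall()
print(rows)  # the trigger sets major = 1 only for types 3..9
```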
=============================================================

explain query plan
select e.type, e.id, e.rsub_id, e.person_id, e.timestamp, e.file_id,
       e.previous_name, e.previous_folder_id, e.transparent, e.folder_id
from event e
where e.deleted = 0 and major = 0 and e.id > 330557
order by e.id asc limit 1;

selectid  order  from  detail
0         0      0     SEARCH TABLE event AS e USING INDEX IDX_event_1 (deleted=? AND major=?)

=============================================================

explain
select e.type, e.id, e.rsub_id, e.person_id, e.timestamp, e.file_id,
       e.previous_name, e.previous_folder_id, e.transparent, e.folder_id
from event e
where e.deleted = 0 and major = 0 and e.id > 330557
order by e.id asc limit 1;

addr  opcode        p1      p2  p3  p4            p5  comment
0     Trace         0       0   0   ""            00
1     Noop          0       0   0   ""            00
2     Integer       1       1   0   ""            00
3     Goto          0       29  0   ""            00
4     OpenRead      0       2   0   14            00
5     OpenRead      2       9   0   k(4,B,B,B,B)  00
6     Integer       0       2   0   ""            00
7     Integer       0       3   0   ""            00
8     SeekGe        2       26  2   2             00
9     IdxGE         2       26  2   2             01
10    IdxRowid      2       4   0   ""            00
11    Seek          0       4   0   ""            00
12    Le            6       25  4   ""            6c
13    Column        0       1   7   ""            00
14    IdxRowid      2       8   0   ""            00
15    Column        0       9   9   ""            00
16    Column        0       10  10  ""            00
17    Column        0       3   11  ""            00
18    Column        0       4   12  ""            00
19    Column        0       7   13  ""            00
20    Column        0       8   14  ""            00
21    Column        0       12  15  0             00
22    Column        0       5   16  ""            00
23    ResultRow     7       10  0   ""            00
24    IfZero        1       26  -1  ""            00
25    Next          2       9   0   ""            00
26    Close         0       0   0   ""            00
27    Close         2       0   0   ""            00
28    Halt          0       0   0   ""            00
29    Transaction   0       0   0   ""            00
30    VerifyCookie  0       26  0   ""            00
31    TableLock     0       2   0   event         00
32    Integer       330557  6   0   ""            00
33    Goto          0       4   0   ""            00

*******************************************************************************************************************
D) Are you testing this inside your own application or using the SQLite shell tool ?

Here is the pseudo/real code of my testing scenario.
agent_root_revision.minor_rev = 330556;
agent_root_revision.major_rev = 330557;

auto beg = boost::chrono::high_resolution_clock::now();
enum { count_iterations = 1000 };
for (int i = 0; i < count_iterations; ++i)
{
    // I'm using this offset in order to avoid returning results from the disk cache.
    int j = 300 * i;

    // Pseudo code: the query is the same as above, with parameters id and major;
    // deleted is hardcoded to "where deleted = 0".
    SELECT 1 EVENT DATA WHERE ID > (agent_root_revision.major_rev - j) and major = 1;

    // Pseudo code: same query shape as above.
    SELECT 1 EVENT DATA WHERE ID > (agent_root_revision.minor_rev - j) and major = 0;
}
auto end = boost::chrono::high_resolution_clock::now();
auto dur = boost::chrono::duration_cast<boost::chrono::milliseconds>(end - beg);
LOG(log_, lg_normal) << "AGENT REV: " << agent_root_revision
                     << ". TIME FOR " << count_iterations << " ITERATIONS: " << dur;

==========================================================================================

And the results are (I did more than 10 measurements, with the original table and with the modified table (with dup_id); the results stay in the same range for each case):

Original table:
2014-Mar-26 09:53:38.247243:NORM: AGENT REV: 330557:330556. TIME FOR 1000 ITERATIONS: 38828 milliseconds

Modified table:
2014-Mar-26 10:02:57.000232:NORM: AGENT REV: 330557:330556. TIME FOR 1000 ITERATIONS: 252 milliseconds

Note that in the second case only the query is modified, to use the dup_id:

select e.type, e.id, e.rsub_id, e.person_id, e.timestamp, e.file_id,
       e.previous_name, e.previous_folder_id, e.transparent, e.folder_id
from event e
where e.deleted = 0 and major = 0 and e.dup_id > 330557
order by e.dup_id asc limit 1;

*******************************************************************************************************************
Some additional information. If I run ANALYZE on this db,
the EXPLAIN QUERY PLAN output changes: a different index will be used (just the integer primary key):

selectid  order  from  detail
0         0      0     SEARCH TABLE event AS e USING INTEGER PRIMARY KEY (rowid>?)

*******************************************************************************************************************

Let me know if you need any more information that I can provide. I'm about to test the proposed "id" INTEGER PRIMARY KEY change and will post the results with it. I'll ask our guys if it is OK to upload the db to your server.

Thank you for the help,
Pavel.

________________________________________
From: sqlite-users-boun...@sqlite.org <sqlite-users-boun...@sqlite.org> on behalf of Graham Holden <sql...@aldurslair.com>
Sent: Wednesday, March 26, 2014 8:57 AM
To: General Discussion of SQLite Database
Subject: Re: [sqlite] Primary key used in compound index

The assertion that AUTOINCREMENT has no effect (here and in another post) is incorrect. Without it, SQLite MAY reuse a key from deleted rows; with it, this will not happen.

Sent from Samsung Galaxy Note

-------- Original message --------
From: Simon Slavin <slav...@bigfraud.org>
Date: 26/03/2014 12:50 (GMT+00:00)
To: General Discussion of SQLite Database <sqlite-users@sqlite.org>
Subject: Re: [sqlite] Primary key used in compound index

On 25 Mar 2014, at 8:19pm, Pavel Vazharov <pa...@anchorworks.com> wrote:

> "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,

Can you try this as exactly the following:

    "id" INTEGER PRIMARY KEY,

leaving everything else out. I assure you that it will obey the 'NOT NULL' and 'AUTOINCREMENT' anyway. The reason is that SQLite has explicit understanding of exactly that phrase for primary keys.
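Simon's point can be seen in a few lines (a sketch using Python's sqlite3 module, with a made-up table `t`): a column declared exactly INTEGER PRIMARY KEY becomes an alias for the rowid and auto-assigns values. The same sketch also shows Graham's caveat about key reuse when AUTOINCREMENT is omitted:

```python
# A bare "INTEGER PRIMARY KEY" column is the rowid itself.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO t (v) VALUES ('a')")  # id auto-assigned, no AUTOINCREMENT needed
conn.execute("INSERT INTO t (v) VALUES ('b')")
rows = conn.execute("SELECT id, rowid, v FROM t").fetchall()
print(rows)  # id and rowid report the same values

# Graham's caveat: without AUTOINCREMENT, a deleted key can be reused.
conn.execute("DELETE FROM t WHERE id = 2")
conn.execute("INSERT INTO t (v) VALUES ('c')")
reused = conn.execute("SELECT max(id) FROM t").fetchone()[0]
print(reused)  # the new row gets id 2 again
```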
> "type" INTEGER NOT NULL,
> "some_integer_data" INTEGER NOT NULL,
> "some_string_data" TEXT,
> "deleted" INTEGER NOT NULL,
> major INTEGER NOT NULL DEFAULT 0
> );
> CREATE INDEX IDX_test_event_1 ON test_event (deleted, major, id);
>
> When I execute:
>
> explain query plan select * from test_event where deleted = 0 and major = 1
> and id > 5 order by id asc limit 1;
>
> It returns:
>
> selectid order from detail
> 0 0 0 SEARCH TABLE test_event USING INDEX IDX_test_event_1 (deleted=? AND major=?)
>
> So I can see that the id is not used; only the deleted and major parts of the
> compound index are used. In our scenario this leads to slow selects, really
> slow.

Not sure whether the id column should be mentioned, but the index you created is an excellent index for that SELECT, so you should be getting extremely fast results. Can you try using EXPLAIN instead of EXPLAIN QUERY PLAN ? It will tell you more about how the query is being handled.

> And the select statement runs about 100 times faster than the previous one.

I agree. Makes no sense.

> My question is: Is there a way for the primary key to be used as part of the
> compound index?

You should think of SQLite as sneakily adding the primary key onto the end of every index you declare, whether you ask for it or not. When devising searching and sorting methods, SQLite knows it's there and should be using it correctly.

> Adding an additional column works, but it will increase the database size and
> will slow down the insert statements, because this additional column needs to
> be updated with a trigger or with an additional query.

I agree this is annoying and we should be able to solve your strange problem. Can you post the following information:

A) What version of SQLite are you using ?
B) What is the result of EXPLAIN SELECT ?
C) How many rows are there in that table ?
D) Are you testing this inside your own application or using the SQLite shell tool ?
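One way to check how any given SQLite build treats the compound index here is to ask the planner directly. This is a sketch with Python's sqlite3 module and toy data; the exact plan text varies between versions (on 3.6.17 the id bound was not shown as part of the index constraint), so the plan is only printed, while the query result itself is what gets verified:

```python
# Reduced test_event schema from the thread, plus a handful of rows.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE test_event (
    id      INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    type    INTEGER NOT NULL,
    deleted INTEGER NOT NULL,
    major   INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX IDX_test_event_1 ON test_event (deleted, major, id);
""")
conn.executemany(
    "INSERT INTO test_event (type, deleted, major) VALUES (?, 0, ?)",
    [(t, 1 if t >= 3 else 0) for t in (1, 2, 3, 4, 5, 6, 7, 8)])

query = ("SELECT id FROM test_event "
         "WHERE deleted = 0 AND major = 1 AND id > 5 ORDER BY id LIMIT 1")
for row in conn.execute("EXPLAIN QUERY PLAN " + query):
    print(row)  # plan text differs between SQLite versions
first = conn.execute(query).fetchone()[0]
print(first)  # rows with major = 1 and id > 5 are 6, 7, 8; LIMIT 1 gives 6
```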
If you want to make it really easy for us to test this, could you put up some test data on a server somewhere ? But we may not need it, so don't work too hard on it at this stage.

Note to others: I get the same results from EXPLAIN QUERY PLAN using SQLite 3.7.13 and no data in the table.

Simon.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
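The ANALYZE observation from earlier in the thread can be sketched end-to-end. This is only an illustration with Python's sqlite3 module and skewed toy data standing in for the real table; since the chosen plan is version- and statistics-dependent, the sketch prints both plans for inspection and verifies only that the query returns the same answer before and after ANALYZE:

```python
# Compare the reported query plan before and after gathering statistics.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE event (
    id      INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
    deleted INTEGER NOT NULL DEFAULT 0,
    major   INTEGER NOT NULL DEFAULT 0
);
CREATE INDEX IDX_event_1 ON event (deleted, major, id);
""")
# Skewed data, roughly like the real table: most rows have major = 0.
conn.executemany("INSERT INTO event (major) VALUES (?)",
                 [(1 if i % 10 == 0 else 0,) for i in range(10000)])

query = ("SELECT id FROM event WHERE deleted = 0 AND major = 0 "
         "AND id > 9000 ORDER BY id LIMIT 1")
plan_before = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
before = conn.execute(query).fetchone()[0]

conn.execute("ANALYZE")  # populates sqlite_stat1, which the planner consults

plan_after = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
after = conn.execute(query).fetchone()[0]
print(plan_before, plan_after, sep="\n")
print(before, after)  # same answer either way; only the chosen plan may differ
```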