Have you done "ANALYZE"?  That might help.

Also...try to arrange your joins based on record count (both high-to-low and 
low-to-high) and see what difference it makes.

Since you have only one WHERE clause I'm guessing having project_ids as the 
first join makes sense.

Michael D. Black
Senior Scientist
NG Information Systems
Advanced Analytics Directorate




From: sqlite-users-boun...@sqlite.org [sqlite-users-boun...@sqlite.org] on 
behalf of Puneet Kishor [punk.k...@gmail.com]
Sent: Tuesday, September 27, 2011 5:46 PM
To: General Discussion of SQLite Database
Subject: EXT :[sqlite] speeding up FTS4


I have

    CREATE TABLE uris (
        uri_id INTEGER PRIMARY KEY,
        uri TEXT,
        uri_content TEXT,
        downloaded_on DATETIME DEFAULT CURRENT_TIMESTAMP,
        feed_history_id INTEGER
    );

with 46608 rows

    CREATE TABLE feed_history (
        feed_history_id INTEGER PRIMARY KEY,
        feed_id INTEGER,
        scanned_on DATETIME DEFAULT CURRENT_TIMESTAMP
    );

with 3276 rows

    CREATE TABLE feeds (feed_id INTEGER PRIMARY KEY, feed_uri TEXT, project_id 
INTEGER);

with 79 rows

    CREATE TABLE projects (project_id INTEGER PRIMARY KEY, project_name TEXT);

with 3 rows

    CREATE INDEX idx_uris_downloaded_on ON uris (downloaded_on);
    CREATE VIRTUAL TABLE fts_uri USING fts4 (uri_id, uri_content);

The database file is about 27 GB.


The following query takes (CPU Time: user 23.952698 sys 73.188765) returning 
46608 rows

    SELECT u.uri_id uri_id, u.uri uri, u.downloaded_on,
        Snippet(fts_uri, '<span class="hilite">', '</span>', '&hellip;', -1, 
64) snippet
    FROM fts_uri f
        JOIN uris u ON f.uri_id = u.uri_id
        JOIN feed_history fh ON u.feed_history_id = fh.feed_history_id
        JOIN feeds f ON fh.feed_id = f.feed_id
        JOIN projects p ON f.project_id = p.project_id
    WHERE p.project_id = 3
        AND Datetime(u.downloaded_on) >= Datetime(p.project_start)
        AND fts_uri MATCH 'education,school'
    ORDER BY u.uri_id, downloaded_on DESC;


The EXPLAIN QUERY PLAN for the above query tells me

    0|0|0|SCAN TABLE fts_uri AS f VIRTUAL TABLE INDEX 4: (~0 rows)
    0|1|1|SEARCH TABLE uris AS u USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
    0|2|2|SEARCH TABLE feed_history AS fh USING INTEGER PRIMARY KEY (rowid=?) 
(~1 rows)
    0|3|3|SEARCH TABLE feeds AS f USING INTEGER PRIMARY KEY (rowid=?) (~1 rows)
    0|4|4|SEARCH TABLE projects AS p USING INTEGER PRIMARY KEY (rowid=?) (~1 
rows)
    0|0|0|USE TEMP B-TREE FOR ORDER BY


Is there anything I can do to speed this up?

--
Puneet Kishor
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to