Hi,

We've been struggling with a complex query that we have written. One of the elements of this complex query is a select statement that doesn't appear to use an index when we *think* it should do. We're not going to ask you to debug a large complex SQL query (unless you have nothing else to do today) but we're working our way through the query trying to understand where we've gone wrong. So we've broken down the query and are trying each section to see what it performs (or rather doesn't perform like).


The sub query is a simple select on a large table, Disruptions. The Disruptions table has 180M rows of data.

The schema for it is here. We've pulled it straight from Navicat for SQLite.

```
CREATE TABLE "Disruptions" (
         "id" INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
         "version" integer NOT NULL,
         "Disruption_id" INTEGER NOT NULL,
         "status" integer NOT NULL,
         "severity" integer NOT NULL,
         "levelOfInterest" integer NOT NULL,
         "category" integer NOT NULL,
         "subCategory" integer NOT NULL,
         "startTime" TEXT NOT NULL,
         "endTime" text NOT NULL,
         "location" integer NOT NULL,
         "corridor" integer NOT NULL,
         "comments" integer NOT NULL,
         "currentUpdate" integer NOT NULL,
         "remarkTime" TEXT NOT NULL,
         "lastModTime" TEXT NOT NULL,
         "CauseAreaPointX" real NOT NULL,
         "CauseAreaPointY" real NOT NULL,
         "Direction" TEXT
);
INSERT INTO "main".sqlite_sequence (name, seq) VALUES ("Disruptions", '184626834');

-- ----------------------------
--  Indexes structure for table Disruptions
-- ----------------------------
CREATE INDEX "Disruptions_Idx1" ON Disruptions ("location" COLLATE NOCASE ASC, "corridor" COLLATE NOCASE ASC, "status" COLLATE NOCASE ASC, "category" COLLATE NOCASE ASC, "severity" COLLATE NOCASE ASC, "levelOfInterest" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC, "startTime" COLLATE NOCASE ASC); CREATE UNIQUE INDEX "Disruptions_Idx2" ON Disruptions ("Disruption_id" COLLATE NOCASE ASC, "version" COLLATE NOCASE ASC, "category" COLLATE NOCASE ASC, "subCategory" COLLATE NOCASE ASC); CREATE UNIQUE INDEX "Disruptions_Idx3" ON Disruptions ("version" COLLATE NOCASE ASC, "Disruption_id" COLLATE NOCASE ASC, "location" COLLATE NOCASE ASC); CREATE INDEX "Disruptions_idx4" ON Disruptions ("status" COLLATE NOCASE ASC);

PRAGMA foreign_keys = true;
```

As part of the larger more complex query, we are executing the query

```
select * from Disruptions where status = 2 OR status = 6;
```

Status is one of six values, 1 to 6 and is probably not evenly distributed across the 180M rows.

If we do

```
sqlite> explain query plan select * from Disruptions where status = 2 OR status = 6;
selectid|order|from|detail
0|0|0|SCAN TABLE Disruptions
```

We can see that table scanning a 180M records is going to be slow, no matter what the rest of the complex query is like.

We have an index Disruptions_idx4 which we *think* should speed it up, but the query plan doesn't seem to take this into account.

We think that only having six values of Status means that the speed up from the index is not going to be fantastic but every little helps.

We have run analyze on the database and that hasn't helped.

Our gut feeling at this moment is that we have the design structure wrong in our DB and we're going to have to take a long hard look at what we're doing, do a redesign and a rebuild as we simply got it wrong from the beginning. Hindsight is wonderful ;) In the interim (as this is a 2-3 month job), we need to speed our query up from 90 mins down to something in the tens of mins.

Any suggestions very much welcomed,

Thanks

Rob
_______________________________________________
sqlite-users mailing list
sqlite-users@mailinglists.sqlite.org
http://mailinglists.sqlite.org/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to