Thank you Simon for responding to my questions. Your phonebook (FName/LName)
analogy clearly explained why 2 indices per table per select won't work.

Let me provide a bit more info and a possible attempt to implement your
suggestions for better indices.

 

My 'bread and butter' query counts the frequency of HCPSCD occurrences for
each county of interest in each state of interest. [Occasionally I want
counts for entire state.] 

I generally have 6 groups of  1~15 related HCPSCD codes and want counts from
1-12 counties of a state. 

 

CLAIMS table columns: claim_no, state_cd, cnty_cd are columns of interest;
40 other columns rarely used.

LINE table columns: claim_no, hcpscd, plus 25 more columns (these are
repeating fields per claim)

 

[FYI: DB contains 44M claim numbers with 1~12 HCPSCD codes per claim,
several thousand HCPSCD codes of which I am interested in 44 in 6 groups, 50
states with 9~125 counties per state of which I am usually interested in
~10.]

 

I currently do a query like this .

SELECT COUNT(HCPSCD) CNT, HCPSCD, STATE_CD, CNTY_CD FROM CLAIMS , LINE

WHERE CLAIMS.CLAIM_NO == LINE.CLAIM_NO AND        

                                HCPSCD IN
('78451','78452','78453','78454','78469','78472','78473','78481','78483','78
494','78496','78499')   -- this is first group of hcpscd codes always
queried

                                AND

                                STATE_CD = '21'

                                AND

                                CNTY_CD IN ('220', '345', '570')

                                                GROUP BY CNTY_CD

UNION                 {repeat above SELECT with second group of hcpscd codes
for same state and county . and so on 4 more times}

 

So, to implement your suggestion of crafting better indices, here's my
approach:

1.       First action is joining the 2 tables on claim_no. {Therefore
claim_no should be first row in index for both tables}

2.       Find state and county in claims table

3.       Find HCPSCD in line table

So my 2 combined indices would be:

                CREATE INDEX idx_Claim_State_Cnty ON CLAIMS (CLAIM_NO,
STATE_CD, CNTY_CD)

                CREATE INDEX idx_Line_hcpscd ON LINE (CLAIM_NO, HCPSCD)

 

Is this what you were suggesting? Will these indices produce improved
performance from the single column indices I was using?

Thanks so much.

Peter

===========================================================

> 1.       Can SQLite use > 1 index per table per SELECT? Seems like using

> both indices for each table would be better than the single index per
table

> as chosen by the plan.

 

No, you?re right: one index per table, per SELECT.  Pretend you have a phone
book with two indexes: surname and first name.  You have to look up someone
called Miri Kallas.  You can use either of the indexes to look up one name
but once you?ve done that the other index is useless to you.

 

You have made up some indexes which are useful but not the most useful.
Drop those indexes and try to figure out one compound index on each table
which would be best for the SELECT you asked about.  Remember that this

 

CREATE INDEX it1 ON t1 (c1)

CREATE INDEX it2 ON t1 (c2)

 

does not do the same thing as

 

CREATE INDEX it12 ON t1 (c1, c2)

 

Can?t do it for you because I can?t tell which of your columns are from
which table.

 

> 2.       Is using something like "SELECT . FROM LINE INDEXED BY claim_no

> AND INDEXED BY hcpscd" possible? What is the correct syntax to include 2

> INDEXED BY this way?

 

If you have to tell SQLite which index to use you?re doing it wrong.  Make
up a good index and SQLite will decide to use it.  Especially if you?ve done
ANALYZE.

 

> 3.       Is there a better way to write this query, for example, would

> rearranging the order of AND criteria in WHERE clause improve performance?

 

The query optimizer is meant to do all that for you.  However, I suspect
that you may understand your query better if you get rid of some of your
'IN' clauses.  If you imagine doing 24 (= 2 * 12) different SELECTs, one for
each State and HCPSCD, what would your SELECT look like then ?  With good
indexes it should be possible to make each of those SELECTs execute
ridiculously fast.

 

Once you?ve figured out how to do that and make it run fast, /then/ you
might want to recombine the query for each State, though perhaps not put
both States in the same query.

 

> 4.       How do I interpret the 'order' and 'from' in the query plan

> results?

 

It?s showing you what each of your indexes is being used for.  And what is
shows is the neither index is being used for both selecting records and
arranging the order of results.

 

Simon.

_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to