RE: [sqlite] Re: Re: Re: Re: How does SQLite choose the index?

2007-08-04 Thread RB Smissaert
Thanks; I have seen this O(N) etc. explanations a lot, but not sure
what they exactly mean.
Does it in this case simply mean O * N and O * (m log N) ?

> and for each entry would perform a logN

Does the logN here mean m log N or something else?

> m==N/logN

Ditto, does this mean break even point roughly when m equals N / (m log N) ?

Sorry, these might be basic questions, but would like to get this clear.


RBS

-Original Message-
From: Igor Tandetnik [mailto:[EMAIL PROTECTED] 
Sent: 04 August 2007 20:01
To: SQLite
Subject: [sqlite] Re: Re: Re: Re: How does SQLite choose the index?

RB Smissaert <[EMAIL PROTECTED]>
wrote:
> One thing I am not sure about yet is when an index would be helpful
> in the
> first place in relation to the data in the field.
> I understand an index is going to help little if the values in a
> particular
> field can only for example be 1 or 0, but roughly when does it become
> useful
> to add an index?

Suppose you have a table with N records. You run a query like "select * 
from t where f='x'; " which selects m records. Without an index on t(f), 
the query would run in O(N) time. With the index, it would be O(m log N) 
(it will scan m entries in the index, and for each entry would perform a 
logN lookup in the main table, by rowid).

Thus, when m is close to N (that is, the query selects almost all 
records), an index actually performs worse than a linear scan. The 
break-even point is somewhere on the order m==N/logN.

Igor Tandetnik 



-
To unsubscribe, send email to [EMAIL PROTECTED]

-




-
To unsubscribe, send email to [EMAIL PROTECTED]
-



[sqlite] Re: Re: Re: Re: How does SQLite choose the index?

2007-08-04 Thread Igor Tandetnik

RB Smissaert <[EMAIL PROTECTED]>
wrote:

One thing I am not sure about yet is when an index would be helpful
in the
first place in relation to the data in the field.
I understand an index is going to help little if the values in a
particular
field can only for example be 1 or 0, but roughly when does it become
useful
to add an index?


Suppose you have a table with N records. You run a query like "select * 
from t where f='x'; " which selects m records. Without an index on t(f), 
the query would run in O(N) time. With the index, it would be O(m log N) 
(it will scan m entries in the index, and for each entry would perform a 
logN lookup in the main table, by rowid).


Thus, when m is close to N (that is, the query selects almost all 
records), an index actually performs worse than a linear scan. The 
break-even point is somewhere on the order m==N/logN.


Igor Tandetnik 



-
To unsubscribe, send email to [EMAIL PROTECTED]
-