Sqlite would need to know if the file was cached or not to make the  
right decision.

The big web site where every user has their own db is a perfect example.
Assume that after a user logs in that their db gets cached (because  
they do many queries) and they do some aggregation, hence it runs fast  
due to the index. Every week the system does the same aggregation over  
ALL db files for billing, hence it loops over all files and nothing is  
in the file cache. This will run 25 times slower with the index.  
Sqlite cannot know the difference between the two cases.

Using many sqlite dbs as partitions (like above for the web site) is a  
great way to scale. The application will know if a table scan is  
better or not. Sqlite itself does not have the view.



On Sep 26, 2008, at 1:23 PM, Nicolas Williams  
<[EMAIL PROTECTED]> wrote:

> On Fri, Sep 26, 2008 at 12:54:36PM -0400, Russell Leighton wrote:
>> I need a 2 key index for some queries and also want to aggregate on
>> these 2 columns. I need this index BUT I have many large sqlite dbs I
>> iterate over and they won't fit in the filesystem cache. Run time  
>> when
>> the index is present is 105min. Run time with out the index is 3min.
>>
>> I see no way a simple query planner can account for factors like
>> available ram, disk io speeds and CPU speeds. The solution DRH
>> suggests is perfect for my needs.
>
> First, CPU speed is probably not an issue here.  Knowing the size of  
> the
> tables and indexes relative to RAM/cache size sure is relevant though.
>
> Given knowledge of table row counts, why couldn't SQLite3 recognize  
> that
> your query is best planned to do a full table scan?
> _______________________________________________
> sqlite-users mailing list
> sqlite-users@sqlite.org
> http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to