On 20 Jan 2015, at 12:12pm, Parakkal, Navin S (Software Engineer) 
<navin.parak...@hp.com> wrote:

>    When I do a select count(*) on hp_table1 it takes more than 5 mins which 
> is quite a huge time.

If this is a table for which rows are inserted but never deleted, then you will 
find that

SELECT max(rowid) FROM hp_table1

returns the same value almost immediately.  Perhaps value-1, but whatever it is 
it'll be consistent.

> Now I create a index [snip]

> Now I do select count(*) on hp_table1 , it takes around 15 secs.
> 

> [...]

> 

> The size of database is around 18GB (empty_database) with rows. Now I do a 
> select count(*) on empty_table1 (actually it contains lots of rows like 100M+ 
> records ) and it takes more than 5 mins. 5 mins is too much of a time for us 
> to bear. The customer wants the information within a minute.

What information ?  The number of rows in a table ?  That's rarely important 
information for a user.  It looks more like the kind of information a database 
admin would want.

The problem is that an unusual aspect of how SQLite works means that the total 
number of rows for a table is not stored anywhere.  To calculate count(*) it 
has to look at every row in the table.  It can do this by looking at every row 
in a table and counting the entries, which is what it did originally to take 5 
minutes.

But a full index on the table has the same number of entries but involves 
handling less data, and SQLite knows that counting the entries in the index you 
created would be faster.  So once you had created the index to find count(*) it 
went through all the entries in that index instead, which took it just 15 
seconds.

> How do we go about resolving this ie what should we do to create a table with 
> sub minute access for 100-500 million .

Access is not the problem here.  SQLite can access any row in a table that big 
in a few milliseconds.  The problem here is the specific function of counting 
every row which is something SQLite does not do quickly.

Take a look at the 'max(rowid)' trick I showed above.

If that's no good for you, you use TRIGGERs which add 1 for each INSERT and 
delete 1 for each DELETE to keep track of the number of rows.  You would create 
another table to keep the total counts in.  This would decrease the time taken 
to return the counts at the cost of increasing the time taken to insert and 
delete rows.

Simon.
_______________________________________________
sqlite-users mailing list
sqlite-users@sqlite.org
http://sqlite.org:8080/cgi-bin/mailman/listinfo/sqlite-users

Reply via email to