[jira] Commented: (DERBY-3937) Select count(*) scans all the rows (and is therefore slow with big tables), is the amount of rows not available/known for example in index ?

Knut Anders Hatlen (JIRA) Fri, 07 Nov 2008 04:53:09 -0800

    [ 
https://issues.apache.org/jira/browse/DERBY-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645753#action_12645753
 ]


Knut Anders Hatlen commented on DERBY-3937:
-------------------------------------------

> So back to my original question - is there any alternative to
> COUNT(*) and implementing of paging over ordered set of (milion)
> rows with constant performance towards 'last' pages ?

One alternative is to create insert and delete triggers on the table
and maintain the row count in a separate table. For example:

ij> create table t(x int);
0 rows inserted/updated/deleted
ij> create table t_row_count(row_count int);
0 rows inserted/updated/deleted
ij> insert into t_row_count values 0;
1 row inserted/updated/deleted
ij> create trigger t_insert after insert on t for each row update t_row_count 
set row_count = row_count + 1;
0 rows inserted/updated/deleted
ij> create trigger t_delete after delete on t for each row update t_row_count 
set row_count = row_count - 1;
0 rows inserted/updated/deleted
ij> insert into t values 1,2,3;
3 rows inserted/updated/deleted
ij> select count(*) from t;
1          
-----------
3          

1 row selected
ij> select row_count from t_row_count;
ROW_COUNT  
-----------
3          

1 row selected
ij> insert into t values 4,5;
2 rows inserted/updated/deleted
ij> select row_count from t_row_count;
ROW_COUNT  
-----------
5          

1 row selected
ij> delete from t where x > 2;
3 rows inserted/updated/deleted
ij> select row_count from t_row_count;
ROW_COUNT  
-----------
2          

1 row selected
ij> select count(*) from t;
1          
-----------
2          

1 row selected

> Select count(*) scans all the rows (and is therefore slow with big tables), 
> is the amount of rows not available/known for example in index ?
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3937
>                 URL: https://issues.apache.org/jira/browse/DERBY-3937
>             Project: Derby
>          Issue Type: Improvement
>          Components: Performance
>         Environment: Any
>            Reporter: Martin Hajduch
>
> Create table with 5000000 rows. Create index on unique ID. Select count(*) on 
> such table is going to take quite some time.
> Shouldn't the index contain amount of indexed rows and the value taken from 
> there ?
> Additionally, queries of the form select count(*) from table where 
> col1=value; take lots of time (depending on amount of rows satisfying WHERE 
> clause) even if index on col1 exists. Isn't it possible to find first and 
> last occurence in the index, and then calculate amount of rows more 
> effectively then scanning through all of them ?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-3937) Select count(*) scans all the rows (and is therefore slow with big tables), is the amount of rows not available/known for example in index ?

Reply via email to