[jira] Commented: (DERBY-3937) Select count(*) scans all the rows (and is therefore slow with big tables), is the amount of rows not available/known for example in index ?

Martin Hajduch (JIRA) Wed, 05 Nov 2008 13:39:36 -0800

    [ 
https://issues.apache.org/jira/browse/DERBY-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645333#action_12645333
 ]


Martin Hajduch commented on DERBY-3937:
---------------------------------------

Ok, in the meantime I read more about the topic and Derby.

So back to my original question - is there any alternative to COUNT(*) and 
implementing of paging over ordered set of (milion) rows with constant 
performance towards 'last' pages ?
If not at the moment - is there any development planned in this area (where I 
could possibly contribute as well) ?
If not, is there any development planned in the area of speeding up the index 
scan so that milion rows do not take 10, but let's say 1 second ? Even such 
constant improvement would help in my particular case.

I tried several open source databases, found Derby to be the closest to my 
needs and I am able to devote some resources in this direction - but would like 
to get some pointers towards what to concentrate on to.

> Select count(*) scans all the rows (and is therefore slow with big tables), 
> is the amount of rows not available/known for example in index ?
> --------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-3937
>                 URL: https://issues.apache.org/jira/browse/DERBY-3937
>             Project: Derby
>          Issue Type: Improvement
>          Components: Performance
>         Environment: Any
>            Reporter: Martin Hajduch
>
> Create table with 5000000 rows. Create index on unique ID. Select count(*) on 
> such table is going to take quite some time.
> Shouldn't the index contain amount of indexed rows and the value taken from 
> there ?
> Additionally, queries of the form select count(*) from table where 
> col1=value; take lots of time (depending on amount of rows satisfying WHERE 
> clause) even if index on col1 exists. Isn't it possible to find first and 
> last occurence in the index, and then calculate amount of rows more 
> effectively then scanning through all of them ?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

[jira] Commented: (DERBY-3937) Select count(*) scans all the rows (and is therefore slow with big tables), is the amount of rows not available/known for example in index ?

Reply via email to