[jira] [Commented] (IGNITE-7167) Optimize 'select count(*) from Table'
[ https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362003#comment-16362003 ]

Vladimir Ozerov commented on IGNITE-7167:
-----------------------------------------

[~vkulichenko],

Regarding MVCC: when doing COUNT(*) you should count only the elements visible to your MVCC counter. The only way to achieve this is to count elements one by one, filtering out the following entries:
1) Entries for not-yet-committed transactions
2) Entries for aborted transactions
3) Entries for newer committed transactions which are not visible to the current transaction

Certain optimizations exist, such as aggregating visibility info at the per-block level, but in the general case we still resort to some kind of iteration over elements (tuples or blocks), rather than reading a single number.

NB: When MVCC is enabled, {{IgniteCache.size()}} would also likely be an O(N) operation rather than O(1).

> Optimize 'select count(*) from Table'
> -------------------------------------
>
>                 Key: IGNITE-7167
>                 URL: https://issues.apache.org/jira/browse/IGNITE-7167
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>    Affects Versions: 2.3
>            Reporter: Valentin Kulichenko
>            Priority: Major
>
> Currently, a query like {{select count(*) from Table}} effectively scans the
> cache and takes a lot of time for large datasets. It probably makes sense to
> optimize it to use {{IgniteCache#size}} directly when possible.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
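To make the three filtering rules above concrete, here is a minimal sketch in plain Java. It is not Ignite's MVCC implementation: the `Entry` class, the `TxState` enum and the single `commitCounter` field are hypothetical simplifications introduced only for illustration, and the model ignores deletes and multiple versions of the same key.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical, simplified MVCC model; NOT Ignite's actual implementation. */
final class MvccCountSketch {
    /** State of the transaction that wrote an entry. */
    enum TxState { ACTIVE, ABORTED, COMMITTED }

    /** One stored entry, tagged with its writer's state and commit counter. */
    static final class Entry {
        final TxState writerState;
        final long commitCounter; // meaningful only when writerState == COMMITTED

        Entry(TxState writerState, long commitCounter) {
            this.writerState = writerState;
            this.commitCounter = commitCounter;
        }
    }

    /** Applies the three rules from the comment to a single entry. */
    static boolean isVisible(Entry e, long readerCounter) {
        if (e.writerState == TxState.ACTIVE) return false;   // 1) not yet committed
        if (e.writerState == TxState.ABORTED) return false;  // 2) aborted
        return e.commitCounter <= readerCounter;             // 3) newer commits are hidden
    }

    /** COUNT(*) under this model: an O(N) scan with a per-entry visibility check. */
    static long countVisible(List<Entry> entries, long readerCounter) {
        long visible = 0;
        for (Entry e : entries) {
            if (isVisible(e, readerCounter)) visible++;
        }
        return visible;
    }

    /** Assumed sample data: three committed entries, one in-flight, one aborted. */
    static List<Entry> sampleEntries() {
        return Arrays.asList(
            new Entry(TxState.COMMITTED, 5),
            new Entry(TxState.COMMITTED, 10),
            new Entry(TxState.COMMITTED, 15),
            new Entry(TxState.ACTIVE, 0),
            new Entry(TxState.ABORTED, 0));
    }
}
```

With the sample data, a reader whose counter is 10 sees two entries, while the raw number of stored entries is five. That gap is why a single stored size value cannot answer the query in this model.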
[jira] [Commented] (IGNITE-7167) Optimize 'select count(*) from Table'
[ https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16361366#comment-16361366 ]

Valentin Kulichenko commented on IGNITE-7167:
---------------------------------------------

[~vozerov],

# But we're going to disallow this eventually, right? For now, is it possible to track this somehow and use {{size()}} when possible? Having several types in a cache is very rare now, so doing a scan in this case is OK.
# Can you elaborate on this? What exactly doesn't work?

In general, if {{size()}} is not an appropriate solution, maybe there is another one? From my experience with other DBs, this query never takes a lot of time even for large tables. And this seems to cause confusion for our users as well. It would be great if we came up with something here, even if not right now.
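The "use {{size()}} when possible" idea can be read as a guard in front of the existing scan. The sketch below is only one way to phrase that guard. The class, method and parameter names are hypothetical and do not correspond to anything in the Ignite code base; the conditions are taken from the points raised in this thread (a plain count query, a single table per cache, no MVCC).

```java
/** Hypothetical guard for a size()-based fast path; names are illustrative only. */
final class CountFastPathSketch {
    /**
     * @param plainCountStar true if the query is exactly "select count(*) from Table",
     *                       with no WHERE clause or anything else
     * @param tablesInCache  number of tables (value types) stored in the cache
     * @param mvccEnabled    whether MVCC visibility rules apply to reads
     * @return true if returning the cache size would give the same answer as a scan
     */
    static boolean canUseCacheSize(boolean plainCountStar, int tablesInCache, boolean mvccEnabled) {
        return plainCountStar && tablesInCache == 1 && !mvccEnabled;
    }
}
```

Under these assumptions the shortcut applies only to the common single-table, non-MVCC case; every other combination falls back to the scan.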
[jira] [Commented] (IGNITE-7167) Optimize 'select count(*) from Table'
[ https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16360409#comment-16360409 ]

Vladimir Ozerov commented on IGNITE-7167:
-----------------------------------------

[~vkulichenko],

{{IgniteCache.size}} is not appropriate for two reasons:
1) We still allow multiple tables in the same cache, so {{size()}} will return the number of entries from all tables
2) It doesn't work for the MVCC case
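The first reason can be illustrated with a small sketch. A plain `java.util.Map` stands in for the cache here, which is an assumption made purely for illustration; the `Person` and `Organization` value types are hypothetical and no Ignite API is used.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustration only: a java.util.Map stands in for a cache holding two tables. */
final class SharedCacheCountSketch {
    /** Hypothetical value type backing a "Person" table. */
    static final class Person {
        final String name;
        Person(String name) { this.name = name; }
    }

    /** Hypothetical value type backing an "Organization" table. */
    static final class Organization {
        final String name;
        Organization(String name) { this.name = name; }
    }

    /** Counts only the entries belonging to one table, which requires a scan. */
    static long countOfType(Map<?, ?> cache, Class<?> valueType) {
        long n = 0;
        for (Object value : cache.values()) {
            if (valueType.isInstance(value)) n++;
        }
        return n;
    }

    /** Assumed sample data: two Person rows and one Organization row in one cache. */
    static Map<Object, Object> sampleCache() {
        Map<Object, Object> cache = new LinkedHashMap<>();
        cache.put("p1", new Person("Alice"));
        cache.put("p2", new Person("Bob"));
        cache.put("o1", new Organization("Acme"));
        return cache;
    }
}
```

For the sample data the map's size is 3, while the count for the `Person` type alone is 2, so the overall size overstates the row count of either table.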
[jira] [Commented] (IGNITE-7167) Optimize 'select count(*) from Table'
[ https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354756#comment-16354756 ]

Valentin Kulichenko commented on IGNITE-7167:
---------------------------------------------

[~vozerov],

I actually meant the specific case when a query exactly like the one in the title is executed, without any WHERE clauses or anything else. Why can't we just call {{IgniteCache#size}} in this case?