[jira] [Comment Edited] (IGNITE-7167) Optimize 'select count(*) from Table'

2018-02-13 Thread Vladimir Ozerov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362003#comment-16362003
 ] 

Vladimir Ozerov edited comment on IGNITE-7167 at 2/13/18 8:50 AM:
--

[~vkulichenko],
Regarding MVCC - when doing {{COUNT(*)}} you should count only the elements 
visible to your MVCC counter. The only way to achieve this is to count elements 
one by one, filtering out the following entries:
1) Entries for not-yet-committed transactions
2) Entries for aborted transactions
3) Entries for newer committed transactions which are not visible to the 
current transaction
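
The three filtering rules above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: {{MvccCountSketch}}, {{Entry}}, {{TxState}} and {{visibleCount}} are hypothetical names, not Ignite classes, and the model ignores updates and deletes (each entry is treated as a single row version).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model for illustration only; not Ignite's actual MVCC structures. */
final class MvccCountSketch {
    enum TxState { ACTIVE, COMMITTED, ABORTED }

    /** One stored entry: the state of the writing transaction and its commit version. */
    static final class Entry {
        final TxState state;
        final long commitVersion; // meaningful only when state == COMMITTED

        Entry(TxState state, long commitVersion) {
            this.state = state;
            this.commitVersion = commitVersion;
        }
    }

    /** Counts the entries visible to a reader whose snapshot version is readerVersion. */
    static long visibleCount(List<Entry> entries, long readerVersion) {
        long count = 0;
        for (Entry e : entries) {
            if (e.state == TxState.ACTIVE)
                continue; // 1) written by a not-yet-committed transaction
            if (e.state == TxState.ABORTED)
                continue; // 2) written by an aborted transaction
            if (e.commitVersion > readerVersion)
                continue; // 3) committed, but newer than the reader's snapshot
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        List<Entry> entries = Arrays.asList(
            new Entry(TxState.COMMITTED, 5L),
            new Entry(TxState.COMMITTED, 12L),
            new Entry(TxState.ACTIVE, 0L),
            new Entry(TxState.ABORTED, 0L));

        // Two readers with different snapshots see different counts.
        System.out.println(visibleCount(entries, 10L));
        System.out.println(visibleCount(entries, 12L));
    }
}
```

The result depends on the reader's version, which is why a single stored number cannot answer the query for every transaction.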

Certain optimizations exist, such as aggregating visibility info at the 
per-block level, but in the general case we still resort to some form of 
iteration over elements (tuple or block), rather than reading a single number.
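
One way such per-block aggregation might look is sketched below. Everything here is an assumption made for illustration: {{BlockCountSketch}} and {{Block}} are hypothetical, a negative version stands for an uncommitted or aborted entry, and the summary is computed once at construction, whereas a real store would have to maintain it on every write.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical per-block visibility summary; not taken from Ignite. */
final class BlockCountSketch {
    static final class Block {
        /** Commit version per entry; a negative value marks an uncommitted or aborted entry. */
        final long[] commitVersions;
        /** Summary: true when every entry in the block is committed. */
        final boolean allCommitted;
        /** Summary: highest commit version among committed entries. */
        final long maxCommitVersion;

        Block(long... commitVersions) {
            this.commitVersions = commitVersions.clone();
            boolean all = true;
            long max = Long.MIN_VALUE;
            for (long v : commitVersions) {
                if (v < 0)
                    all = false;
                else
                    max = Math.max(max, v);
            }
            this.allCommitted = all;
            this.maxCommitVersion = max;
        }
    }

    static long visibleCount(List<Block> blocks, long readerVersion) {
        long count = 0;
        for (Block b : blocks) {
            if (b.allCommitted && b.maxCommitVersion <= readerVersion) {
                // Fast path: the whole block is visible, so no per-entry check is needed.
                count += b.commitVersions.length;
                continue;
            }
            // Slow path: check every entry in the block.
            for (long v : b.commitVersions) {
                if (v >= 0 && v <= readerVersion)
                    count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        List<Block> blocks = Arrays.asList(
            new Block(1L, 2L, 3L),    // fully visible to a reader at version 10
            new Block(4L, -1L, 20L)); // mixed, needs the per-entry check
        System.out.println(visibleCount(blocks, 10L));
    }
}
```

Even with the fast path, the count still iterates over all blocks, so it stays proportional to the data size rather than becoming a single read.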

NB: When MVCC is enabled, {{IgniteCache.size()}} would also likely be an O(N) 
operation rather than O(1).


> Optimize 'select count(*) from Table'
> -
>
> Key: IGNITE-7167
> URL: https://issues.apache.org/jira/browse/IGNITE-7167
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 2.3
> Reporter: Valentin Kulichenko
> Priority: Major
>
> Currently a query like {{select count(*) from Table}} effectively scans the 
> cache and takes a lot of time for large datasets. It probably makes sense to 
> optimize it to use {{IgniteCache#size}} directly when possible.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Comment Edited] (IGNITE-7167) Optimize 'select count(*) from Table'

2018-02-13 Thread Vladimir Ozerov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362003#comment-16362003
 ] 

Vladimir Ozerov edited comment on IGNITE-7167 at 2/13/18 8:50 AM:
--

[~vkulichenko],
Regarding MVCC - when doing {{COUNT(*)}} you should count only the elements 
visible to your MVCC version. The only way to achieve this is to count elements 
one by one, filtering out the following entries:
1) Entries for not-yet-committed transactions
2) Entries for aborted transactions
3) Entries for newer committed transactions which are not visible to the 
current transaction

Certain optimizations exist, such as aggregating visibility info at the 
per-block level, but in the general case we still resort to some form of 
iteration over elements (tuples or blocks), rather than reading a single number.

NB: When MVCC is enabled, {{IgniteCache.size()}} would also likely be an O(N) 
operation rather than O(1).




[jira] [Comment Edited] (IGNITE-7167) Optimize 'select count(*) from Table'

2018-02-13 Thread Vladimir Ozerov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362003#comment-16362003
 ] 

Vladimir Ozerov edited comment on IGNITE-7167 at 2/13/18 8:50 AM:
--

[~vkulichenko],
Regarding MVCC - when doing {{COUNT(*)}} you should count only the elements 
visible to your MVCC version. The only way to achieve this is to count elements 
one by one, filtering out the following entries:
1) Entries for not-yet-committed transactions
2) Entries for aborted transactions
3) Entries for newer committed transactions which are not visible to the 
current transaction

Certain optimizations exist, such as aggregating visibility info at the 
per-block level, but in the general case we still resort to some form of 
iteration over elements (tuple or block), rather than reading a single number.

NB: When MVCC is enabled, {{IgniteCache.size()}} would also likely be an O(N) 
operation rather than O(1).




[jira] [Comment Edited] (IGNITE-7167) Optimize 'select count(*) from Table'

2018-02-13 Thread Vladimir Ozerov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16362003#comment-16362003
 ] 

Vladimir Ozerov edited comment on IGNITE-7167 at 2/13/18 8:49 AM:
--

[~vkulichenko],
Regarding MVCC - when doing {{COUNT(*)}} you should count only the elements 
visible to your MVCC counter. The only way to achieve this is to count elements 
one by one, filtering out the following entries:
1) Entries for not-yet-committed transactions
2) Entries for aborted transactions
3) Entries for newer committed transactions which are not visible to the 
current transaction

Certain optimizations exist, such as aggregating visibility info at the 
per-block level, but in the general case we still resort to some form of 
iteration over elements (tuple or block), rather than reading a single number.

NB: When MVCC is enabled, {{IgniteCache.size()}} would also likely be an O(N) 
operation rather than O(1).


was (Author: vozerov):
[~vkulichenko],
Regarding MVCC - when doing COUNT(*) you should count only the elements visible 
to your MVCC counter. The only way to achieve this is to count elements 
one by one, filtering out the following entries:
1) Entries for not-yet-committed transactions
2) Entries for aborted transactions
3) Entries for newer committed transactions which are not visible to the 
current transaction

Certain optimizations exist, such as aggregating visibility info at the 
per-block level, but in the general case we still resort to some form of 
iteration over elements (tuple or block), rather than reading a single number.

NB: When MVCC is enabled, {{IgniteCache.size()}} would also likely be an O(N) 
operation rather than O(1).



[jira] [Comment Edited] (IGNITE-7167) Optimize 'select count(*) from Table'

2017-12-15 Thread Vladimir Ozerov (JIRA)

[ 
https://issues.apache.org/jira/browse/IGNITE-7167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16292205#comment-16292205
 ] 

Vladimir Ozerov edited comment on IGNITE-7167 at 12/15/17 8:43 AM:
---

We already optimized this, see IGNITE-6702. We cannot get rid of the scan in 
the general case. For now we iterate over index entries and count them. This is 
the best that can be done. 
However, when MVCC is ready, we will have to resort to (almost) the original 
approach: for every entry in the index we would have to check whether it is 
visible.




