[ 
https://issues.apache.org/jira/browse/IGNITE-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vladimir Ozerov closed IGNITE-6057.
-----------------------------------

> SQL: Full scan should be performed through data pages bypassing primary index
> -----------------------------------------------------------------------------
>
>                 Key: IGNITE-6057
>                 URL: https://issues.apache.org/jira/browse/IGNITE-6057
>             Project: Ignite
>          Issue Type: Task
>          Components: persistence, sql
>    Affects Versions: 2.1
>            Reporter: Vladimir Ozerov
>              Labels: performance
>
> Currently both SQL full scans and {{CREATE INDEX}} commands iterate through 
> the primary index to get all existing values. Suppose we have 10 entries 
> per data page on average. In this case we may have to read the same data 
> page up to 10 times, once for each relevant key reached in a different part 
> of the index tree. This can be very inefficient on certain workloads.
> We should iterate over data pages directly instead. This way a page with 10 
> entries will be accessed only once. However, we should take cache groups into 
> account: if there are too many entries from other logical caches, this 
> approach could make the situation even worse, unless we have a mechanism to 
> skip unnecessary entries (or whole pages!) efficiently.
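
To make the access-count argument above concrete, here is a minimal sketch. It is a toy model, not Ignite code: the function name, the random placement of keys across pages, and the rule that consecutive hits on the same page count as one access are all assumptions made for illustration.

```python
import random


def count_page_accesses(n_entries, entries_per_page, seed=0):
    """Compare page accesses for an index-order scan and a page-order scan.

    Toy model (assumption): entries sit in data pages in insertion order,
    while the index returns them in key order, which is unrelated to the
    physical order. Consecutive hits on the same page count as one access.
    """
    rng = random.Random(seed)

    # slots[i] is the physical slot of the i-th key in index order.
    slots = list(range(n_entries))
    rng.shuffle(slots)

    index_accesses = 0
    last_page = None
    for slot in slots:
        page = slot // entries_per_page
        if page != last_page:  # the scan has to switch to another data page
            index_accesses += 1
            last_page = page

    # A direct scan touches every data page exactly once (ceiling division).
    page_scan_accesses = -(-n_entries // entries_per_page)
    return index_accesses, page_scan_accesses


index_reads, page_reads = count_page_accesses(1000, 10)
```

With 10 entries per page, the page-order scan needs one access per page, while the index-order scan in this model revisits pages many times because neighbouring keys rarely share a page.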
> We should probably develop a cost-based model that takes the following 
> statistics into account:
> 1) Average entry size. The larger the entry, the smaller the benefit, 
> especially if overflow pages are used frequently. 
> 2) Cache groups. Ideally, we should estimate the number of entries from all 
> logical caches. The more entries from other caches, the smaller the benefit.
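
One possible shape for such a cost model is sketched below. Everything in it is hypothetical: the function names, the 4096-byte page size, the assumption that all caches in the group share the same average entry size, and the choice to ignore the cost of reading the index pages themselves. It only illustrates how the two statistics listed above could feed into a decision.

```python
def ceil_div(a, b):
    """Integer ceiling division."""
    return -(-a // b)


def index_scan_cost(target_entries, avg_entry_size, page_size=4096):
    # Worst case: every entry is fetched separately through the index.
    # Entries larger than a page also pull in their overflow pages.
    pages_per_entry = max(1, ceil_div(avg_entry_size, page_size))
    return target_entries * pages_per_entry


def page_scan_cost(target_entries, other_entries, avg_entry_size, page_size=4096):
    # A direct scan reads every data page of the cache group, including
    # pages filled with entries from other logical caches.
    total_bytes = (target_entries + other_entries) * avg_entry_size
    return ceil_div(total_bytes, page_size)


def choose_scan(target_entries, other_entries, avg_entry_size, page_size=4096):
    by_index = index_scan_cost(target_entries, avg_entry_size, page_size)
    by_pages = page_scan_cost(target_entries, other_entries, avg_entry_size, page_size)
    return "data_pages" if by_pages < by_index else "index"


# Small entries, no other caches in the group: the direct scan wins.
small_alone = choose_scan(1000, 0, 400)
# Same cache sharing a group with 20000 foreign entries: the index wins.
small_shared = choose_scan(1000, 20000, 400)
```

In this model the benefit disappears once entries reach or exceed the page size, and it reverses when entries from other caches dominate the group, which matches the two points in the list.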



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
