[ https://issues.apache.org/jira/browse/IGNITE-6057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vladimir Ozerov updated IGNITE-6057:
------------------------------------
Labels: performance (was: iep-1 performance)
> SQL: Full scan should be performed through data pages bypassing primary index
> -----------------------------------------------------------------------------
>
> Key: IGNITE-6057
> URL: https://issues.apache.org/jira/browse/IGNITE-6057
> Project: Ignite
> Issue Type: Task
> Components: persistence, sql
> Affects Versions: 2.1
> Reporter: Vladimir Ozerov
> Labels: iep-1, performance
>
> Currently both SQL full scans and {{CREATE INDEX}} commands iterate through
> the primary index to get all existing values. Suppose we have 10 entries
> per data page on average. In this case we will have to read the same data
> page 10 times when reaching relevant keys in different parts of the index tree.
> This can be very inefficient on certain workloads.
> We should iterate over data pages directly instead. This way a page with 10
> entries will be accessed only once. However, we should take cache groups into
> account: if there are too many entries from other logical caches, this
> approach could make the situation even worse, unless we have a mechanism to skip
> unnecessary entries (or whole pages!) efficiently.
> Probably we should develop a cost-based model that takes the following
> statistics into account:
> 1) Average entry size. The longer the entry, the smaller the benefit,
> especially if overflow pages are used frequently.
> 2) Cache groups. Ideally, we should estimate the number of entries from all
> logical caches. The more entries from other caches, the smaller the benefit.
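
The trade-off described above can be sketched as a toy page-read estimate. This is only an illustration of the reasoning in the issue, not Ignite code: the function names, the 4096-byte page size, and the worst-case assumption that an index scan reads one data page per entry are all hypothetical, and overflow pages and page headers are not modelled.

```python
from math import ceil

PAGE_SIZE = 4096  # assumed page size in bytes; illustrative only


def entries_per_page(avg_entry_size, page_size=PAGE_SIZE):
    """Rough number of entries that fit in one data page (headers ignored)."""
    return max(1, page_size // avg_entry_size)


def index_scan_reads(target_entries):
    """Worst case from the issue: one data page read per index entry."""
    return target_entries


def page_scan_reads(target_entries, other_entries, avg_entry_size):
    """Direct scan: every data page of the cache group is read once,
    including pages filled by entries of other logical caches."""
    total = target_entries + other_entries
    return ceil(total / entries_per_page(avg_entry_size))


def prefer_page_scan(target_entries, other_entries, avg_entry_size):
    """True when the direct data page scan needs fewer page reads."""
    direct = page_scan_reads(target_entries, other_entries, avg_entry_size)
    return direct < index_scan_reads(target_entries)


if __name__ == "__main__":
    # 1000 entries of ~400 bytes, no other caches in the group.
    print(page_scan_reads(1000, 0, 400), index_scan_reads(1000))
    # Same cache, but 20000 entries from other logical caches share the pages.
    print(prefer_page_scan(1000, 20000, 400))
```

Under these assumptions, 1000 entries of about 400 bytes cost 100 page reads with a direct scan against 1000 through the index. Adding 20000 entries from other logical caches raises the direct scan to 2100 page reads, so the index scan becomes the cheaper choice. With entries as large as a page the two estimates are equal, which matches point 1 above.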
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)