Vladimir Ozerov created IGNITE-6057:
---------------------------------------
Summary: SQL: Full scan should be performed through data pages
bypassing primary index
Key: IGNITE-6057
URL: https://issues.apache.org/jira/browse/IGNITE-6057
Project: Ignite
Issue Type: Improvement
Components: persistence, sql
Affects Versions: 2.1
Reporter: Vladimir Ozerov
Fix For: 2.2
Currently both SQL full scan and {{CREATE INDEX}} commands iterate through
the primary index to get all existing values. Suppose we have 10 entries per
data page on average. In this case we will have to read the same data page 10
times, once for each of its keys as they are reached in different parts of the
index tree. This could be very inefficient on certain workloads.
We should iterate over data pages directly instead. This way a page with 10
entries will be accessed only once. However, we should take cache groups into
account: if there are too many entries from other logical caches, this approach
could make the situation even worse, unless we have a mechanism to skip
unnecessary entries (or whole pages!) efficiently.
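To illustrate the difference in page accesses, here is a minimal simulation sketch. It is not Ignite code: the function name {{count_page_reads}}, the random placement of keys on pages, and the assumption that only one data page is held at a time (no page cache) are all hypothetical simplifications.

```python
import random


def count_page_reads(num_entries=1000, entries_per_page=10, seed=0):
    """Return (index_order_reads, page_order_reads) for a toy storage layout.

    Assumptions (illustrative only, not Ignite internals):
      * entries land on data pages in insertion order, unrelated to key order;
      * only the most recently read page is kept, so returning to a page
        after visiting another one counts as a new read.
    """
    rng = random.Random(seed)
    keys = list(range(num_entries))
    rng.shuffle(keys)  # insertion order is independent of key order

    # The entry inserted at position `pos` is stored on page pos // entries_per_page.
    page_of_key = {key: pos // entries_per_page for pos, key in enumerate(keys)}

    # Index-order scan: walk keys in sorted order and dereference each data page.
    index_reads = 0
    current_page = None
    for key in sorted(page_of_key):
        page = page_of_key[key]
        if page != current_page:
            index_reads += 1
            current_page = page

    # Page-order scan: every data page is read exactly once.
    page_reads = len(set(page_of_key.values()))
    return index_reads, page_reads


if __name__ == "__main__":
    index_reads, page_reads = count_page_reads()
    print("index-order page reads:", index_reads)
    print("page-order page reads:", page_reads)
```

Under these assumptions the page-order scan reads each page once, while the index-order scan approaches one page read per entry. A real page cache would narrow the gap, so treat the numbers only as an upper-bound illustration.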
We should probably develop a cost-based model that takes the following
statistics into account:
1) Average entry size. The larger the entry, the smaller the benefit,
especially if overflow pages are used frequently.
2) Cache groups. Ideally, we should estimate the number of entries from all
logical caches in the group. The more entries from other caches, the smaller
the benefit.
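A rough sketch of how such a cost model could combine the two statistics is shown below. Everything in it is an assumption made for illustration: the function names, the cost unit (data page reads), the {{page_size}} default, and the simplification that index tree pages and page caching are ignored. It is not a proposed implementation.

```python
import math


def estimate_scan_costs(target_entries, other_entries, avg_entry_size, page_size=4096):
    """Return (index_scan_cost, page_scan_cost) in data page reads.

    Hypothetical model:
      * index scan reads the data page(s) of every entry of the target cache;
      * page scan reads every data page of the cache group, including pages
        filled by other logical caches;
      * an entry larger than a page occupies ceil(size / page_size) pages
        (a stand-in for overflow pages).
    """
    pages_per_entry = math.ceil(avg_entry_size / page_size)
    entries_per_page = max(1, page_size // avg_entry_size)

    index_scan_cost = target_entries * pages_per_entry

    total_entries = target_entries + other_entries
    if pages_per_entry == 1:
        page_scan_cost = math.ceil(total_entries / entries_per_page)
    else:
        page_scan_cost = total_entries * pages_per_entry
    return index_scan_cost, page_scan_cost


def prefer_page_scan(target_entries, other_entries, avg_entry_size, page_size=4096):
    """True when the modeled page scan needs strictly fewer page reads."""
    index_cost, page_cost = estimate_scan_costs(
        target_entries, other_entries, avg_entry_size, page_size
    )
    return page_cost < index_cost
```

In this model a single cache with small entries favors the page scan, a group dominated by other caches favors the index scan, and entries larger than a page remove the benefit entirely, which matches the two points above.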