[
https://issues.apache.org/jira/browse/IGNITE-11998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17541980#comment-17541980
]
Maxim Muzafarov commented on IGNITE-11998:
------------------------------------------
h4. The initial proposal
Currently, a full scan of a cache group partition (SqlQuery or ScanQuery)
reads all the data through the partition B-Tree, which in turn leads to
_n(log n)_ complexity. For such queries it may be better to read all the data
sequentially, page by page, directly from the partition file: this has _n_
complexity, and sequential file reads also have some advantages over
random-access file reads.
h4. The main issue
According to the [Ignite Multi-Tier Storage - under the
hood|https://cwiki.apache.org/confluence/display/IGNITE/Ignite+Multi-Tier+Storage+-+under+the+hood#IgniteMultiTierStorageunderthehood-Longobjects]
long objects are split across several pages. Pages that contain an entry tail
have no dedicated page attribute or page header flag to identify them;
however, such pages hold a link to another fragment or to an entry head. These
pages may only be accessed from the page that contains the entry head.
h4. Current solution and benchmarks
_A double loop over all partition pages._
During the first loop we read all the pages and collect references to the
other pages (entries are read from head to tail and written from tail to
head). During the second loop we build the list of pages that no other page
references: these are the pages that contain the entry heads to be read.
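
The two loops can be sketched roughly as follows. This is a simplified illustration, not the actual patch: it assumes one entry fragment per page and a head-to-tail link direction, and {{HeadPageFinder}}, {{findHeadPages}} and {{NO_LINK}} are hypothetical names that do not exist in Ignite.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class HeadPageFinder {
    /** Hypothetical sentinel meaning "this page links to no further fragment". */
    static final long NO_LINK = -1L;

    /**
     * @param nextLinks Page id -> id of the page holding the next fragment, or NO_LINK.
     * @return Ids of the pages that no other page links to, in map iteration order.
     */
    static List<Long> findHeadPages(Map<Long, Long> nextLinks) {
        // First loop: collect every page that is the target of a link.
        Set<Long> referenced = new HashSet<>();

        for (long next : nextLinks.values()) {
            if (next != NO_LINK)
                referenced.add(next);
        }

        // Second loop: keep only the pages that nothing refers to.
        List<Long> heads = new ArrayList<>();

        for (long pageId : nextLinks.keySet()) {
            if (!referenced.contains(pageId))
                heads.add(pageId);
        }

        return heads;
    }

    public static void main(String[] args) {
        Map<Long, Long> pages = new TreeMap<>();

        pages.put(1L, NO_LINK); // Unfragmented entry.
        pages.put(2L, 4L);      // Entry head, continues on page 4.
        pages.put(3L, NO_LINK); // Last fragment of the entry started on page 2.
        pages.put(4L, 3L);      // Middle fragment, continues on page 3.

        System.out.println(findHeadPages(pages)); // Prints [1, 2]
    }
}
```

In this sketch the set of referenced pages has to be complete before any head can be emitted, which is why the scan needs two passes over the partition.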
||Data Page Scan||true||false||
|IgniteDataPageScanBenchmark|148848|179228|
|IgniteDataPageScanBenchmark|186917|166980|
|IgniteDataPageScanBenchmark|197114|175667|
h4. Possible solutions
Additional analysis and investigation are required to perform the full
partition scan using only one loop. We need to identify the fragmented pages
that contain entry tails:
- for such pages we can write a marker value, e.g. {{-1}} (currently it's
zero), into {{freeSpace}}, {{directCounter}}, {{indirectCounter}}; here we need
to check the PDS compatibility.
- almost the same issue with identifying fragmented pages is described in
IGNITE-12510
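
If such a marker were available, the single-loop scan might look like the sketch below. It is only an illustration of the idea: {{PageHeader}} is a simplified stand-in rather than Ignite's real page layout, {{TAIL_MARKER}} and {{scannablePages}} are hypothetical names, and storing the marker in {{directCounter}} is just one of the options listed above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class SinglePassScan {
    /** Hypothetical marker written into the header of a page that holds only entry tails. */
    static final int TAIL_MARKER = -1;

    /** Simplified stand-in for a data page header (not the real Ignite layout). */
    static final class PageHeader {
        final long pageId;
        final int directCounter;

        PageHeader(long pageId, int directCounter) {
            this.pageId = pageId;
            this.directCounter = directCounter;
        }
    }

    /** One sequential pass: pages flagged with the marker are skipped without being parsed. */
    static List<Long> scannablePages(List<PageHeader> pagesInFileOrder) {
        List<Long> res = new ArrayList<>();

        for (PageHeader hdr : pagesInFileOrder) {
            if (hdr.directCounter != TAIL_MARKER)
                res.add(hdr.pageId);
        }

        return res;
    }

    public static void main(String[] args) {
        List<PageHeader> pages = Arrays.asList(
            new PageHeader(10L, 3),           // Regular page with three entries.
            new PageHeader(11L, TAIL_MARKER), // Tail fragment, skipped.
            new PageHeader(12L, 0),           // Empty page: zero stays distinct from the marker.
            new PageHeader(13L, TAIL_MARKER)  // Tail fragment, skipped.
        );

        System.out.println(scannablePages(pages)); // Prints [10, 12]
    }
}
```

Pages written before such a change would still carry zero in these fields, so they could not be told apart from empty pages; this is the compatibility concern mentioned in the first bullet.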
> Fix DataPageScan for fragmented pages.
> --------------------------------------
>
> Key: IGNITE-11998
> URL: https://issues.apache.org/jira/browse/IGNITE-11998
> Project: Ignite
> Issue Type: Bug
> Reporter: Ivan Bessonov
> Assignee: Maxim Muzafarov
> Priority: Critical
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Fragmented pages crash JVM when accessed by DataPageScan scanner/query
> optimized scanner. It happens when scanner accesses data in later chunk in
> fragmented entry but treats it like the first one, expecting length of the
> payload, which is absent and replaced with raw entry data.
--
This message was sent by Atlassian Jira
(v8.20.7#820007)