[ https://issues.apache.org/jira/browse/IGNITE-17157 ]
Julia Bakulina deleted comment on IGNITE-17157:
-----------------------------------------
was (Author: JIRAUSER294860):
[~NIzhikov] hi! Could you please review?
> Documentation of the Ignite index reader
> ----------------------------------------
>
> Key: IGNITE-17157
> URL: https://issues.apache.org/jira/browse/IGNITE-17157
> Project: Ignite
> Issue Type: Task
> Reporter: Denis Chudov
> Assignee: Julia Bakulina
> Priority: Major
> Labels: documentation, ise
> Time Spent: 40m
> Remaining Estimate: 0h
>
> It would be nice to have documentation for the Ignite index reader utility
> that was added in IGNITE-14529.
> {panel:title=Draft}
> // Here should also be an overview with the description of the purposes of
> the utility
> To run this utility, use the index-reader.sh or index-reader.bat script from
> the Ignite *bin* directory.
> *Command line parameters:*
> *--dir*: partition directory, where index.bin and (optionally) partition
> files are located.
> *--part-cnt*: total number of partitions in the cache group. Default value: 0
> *--page-size*: page size. Default value: 4096
> *--page-store-ver*: page store version. Default value: 2
> *--indexes*: comma-separated list (without spaces) of index tree names to
> process; all other index trees are skipped. Default value: null. Index tree
> names are not the same as index names: they have the format
> _cacheId_typeId_indexName##H2Tree%segmentNumber_, e.g.
> {{2652_885397586_T0_F0_F1_IDX##H2Tree%0}}. You can find them in the utility
> output, in the traversal information sections (RECURSIVE and HORIZONTAL).
> *--dest-file*: file to print the report to (by default report is printed to
> console). Default value: null
> *--check-parts*: check the cache data tree in partition files and its
> consistency with indexes. Default value: false
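As a rough illustration of the index tree name format described for *--indexes*, the sketch below splits a tree name into its parts. The names parse_tree_name and TreeName and the regular expression are assumptions made for this example and are not part of the utility. The pattern also assumes that cacheId and typeId are integers (possibly negative), as in the names shown in this draft.

```python
import re
from typing import NamedTuple


class TreeName(NamedTuple):
    """Parts of an index tree name (hypothetical helper type)."""
    cache_id: int
    type_id: int
    index_name: str
    segment: int


# cacheId_typeId_indexName##H2Tree%segmentNumber
# Assumption: cacheId and typeId are decimal integers, possibly negative.
_TREE_NAME = re.compile(r"^(-?\d+)_(-?\d+)_(.+)##H2Tree%(\d+)$")


def parse_tree_name(name: str) -> TreeName:
    """Split an index tree name into cache id, type id, index name, segment."""
    m = _TREE_NAME.match(name)
    if m is None:
        raise ValueError(f"not an index tree name: {name!r}")
    return TreeName(int(m.group(1)), int(m.group(2)), m.group(3), int(m.group(4)))
```

For the tree name used in this draft, parse_tree_name("2652_885397586_T0_F0_F1_IDX##H2Tree%0") yields cache id 2652, type id 885397586, index name T0_F0_F1_IDX and segment 0.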
> The utility can analyze index.bin and, optionally, partition files: if
> *--part-cnt* is greater than 0 and partition files are present, it reads
> CacheDataTree and looks into data pages to check their availability. It reads
> all index trees from index.bin and traverses them in two ways:
> - recursive traversal from root to leaves
> - traversal level by level, since all pages on one level are connected
> through forward ids.
> It also reads page reuse lists. Finally, it scans all pages in the file,
> trying to detect orphan pages (those that have no references from index
> trees or reuse lists).
> So, the output of the IgniteIndexReader consists of 4 main sections:
> - recursive traversal info (with prefix <RECURSIVE>)
> - horizontal traversal info (with prefix <HORIZONTAL>)
> - page reuse lists info (with prefix <PAGE_LIST>)
> - sequential scan of all pages.
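Since most report lines carry a section prefix, a saved report can be split by prefix for easier reading. The sketch below is only an illustration: the function name split_by_prefix and the "OTHER" bucket for unprefixed lines (such as the sequential scan output) are assumptions of this example, and the <ERROR> prefix is the one this draft uses for error lines.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Prefixes mentioned in this draft; lines without one go to "OTHER".
_PREFIX = re.compile(r"^<(RECURSIVE|HORIZONTAL|PAGE_LIST|ERROR)>\s?(.*)$")


def split_by_prefix(report: str) -> Dict[str, List[str]]:
    """Group report lines by their section prefix (hypothetical helper)."""
    sections: Dict[str, List[str]] = defaultdict(list)
    for line in report.splitlines():
        m = _PREFIX.match(line)
        if m:
            sections[m.group(1)].append(m.group(2))
        else:
            sections["OTHER"].append(line)
    return dict(sections)
```

For example, split_by_prefix(text).get("ERROR", []) returns only the error lines of a report stored in text.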
> Optionally, with the *--check-parts* parameter, the output also shows how
> CacheDataTree matches SQL indexes. If there are no errors, only a message
> like this is printed:
> {noformat}
> Partitions check detected no errors.
> Partition check finished, total errors: 0, total problem partitions: 0
> {noformat}
> Otherwise, there is a “Partitions check:” section with a list of errors. For
> example, this is the message about an entry that was found in CacheDataTree
> but was not found in SQL indexes:
> {noformat}
> <ERROR> Errors detected in partition, partId=1023
> <ERROR> Entry is missing in index: I
> [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0, pageId=0002ffff0000000d],
> cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> <ERROR> Entry is missing in index: I
> [idxName=2652_885397586_T0_F2_IDX##H2Tree%0, pageId=0002ffff0000000b],
> cacheId=2652, partId=1023, pageIndex=8, itemId=0, link=285868728254472
> {noformat}
> All errors in the output have the prefix <ERROR>.
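The link value in such a message appears to pack the other fields of the same line. The sketch below decodes it under an assumed bit layout that this draft does not state: 8 bits of item id, 8 bits of page flag, 16 bits of partition id and 32 bits of page index, from high to low. The names decode_link and Link are hypothetical. With this layout, the sample value 285868728254472 decodes to partId=1023, pageIndex=8 and itemId=0, which agrees with the message above.

```python
from typing import NamedTuple


class Link(NamedTuple):
    """Decoded fields of a link value (hypothetical helper type)."""
    item_id: int
    flag: int
    part_id: int
    page_index: int


def decode_link(link: int) -> Link:
    """Decode a link, ASSUMING the layout itemId(8) | flag(8) | partId(16) | pageIdx(32)."""
    return Link(
        item_id=(link >> 56) & 0xFF,
        flag=(link >> 48) & 0xFF,
        part_id=(link >> 32) & 0xFFFF,
        page_index=link & 0xFFFFFFFF,
    )
```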
> h3. Command line examples
> Analyze files from /gridgain/corrupted_idxs. The cache group has 1024
> partitions (some partition files can be missing if the node they were taken
> from did not own those partitions). Use page size 4096 and page store
> version 2, and write the report to report.txt:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024
> --page-size 4096 --page-store-ver 2 --dest-file "report.txt"
> {noformat}
> Read only SQL indexes:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --dest-file "report.txt"
> {noformat}
> Read SQL indexes and check cache data tree in partitions:
> {noformat}
> ./index-reader.sh --dir "/gridgain/corrupted_idxs" --part-cnt 1024
> --check-parts --dest-file "rep
> {noformat}
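When the utility is launched from a script, the command line can be assembled from the parameters listed in this draft. The sketch below is a minimal example under stated assumptions: the function name build_index_reader_args and the script path are hypothetical, the defaults are the ones given in the parameter list, and *--check-parts* is treated as a switch without a value, as in the example above.

```python
from typing import List, Optional, Sequence


def build_index_reader_args(
    directory: str,
    part_cnt: int = 0,
    page_size: int = 4096,
    page_store_ver: int = 2,
    indexes: Optional[Sequence[str]] = None,
    dest_file: Optional[str] = None,
    check_parts: bool = False,
    script: str = "./index-reader.sh",  # assumption: run from the Ignite bin directory
) -> List[str]:
    """Build an argument list for the index reader script (hypothetical helper)."""
    args = [
        script,
        "--dir", directory,
        "--part-cnt", str(part_cnt),
        "--page-size", str(page_size),
        "--page-store-ver", str(page_store_ver),
    ]
    if indexes:
        # Tree names are separated by commas without spaces.
        args += ["--indexes", ",".join(indexes)]
    if dest_file is not None:
        args += ["--dest-file", dest_file]
    if check_parts:
        # Assumption: a switch without a value.
        args.append("--check-parts")
    return args
```

The resulting list can be passed to subprocess.run(args, check=True) without going through a shell, so tree names containing # or % need no extra quoting.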
> h3. Output samples
> The <RECURSIVE> and <HORIZONTAL> output sections contain information about
> index trees: tree name, root page id, page type statistics, and item count.
> The format is the same for both traversals.
> {noformat}
> <RECURSIVE> Index tree: I [idxName=2654_-1177891018__key_PK##H2Tree%0,
> pageId=0202ffff00000066]
> <RECURSIVE> -- Page stat:
> <RECURSIVE> H2ExtrasLeafIO: 2
> <RECURSIVE> H2ExtrasInnerIO: 1
> <RECURSIVE> BPlusMetaIO: 1
> <RECURSIVE> -- Count of items found in leaf pages: 200
> <RECURSIVE> No errors occurred while traversing.
> ...
> <RECURSIVE> Total trees: 19
> <RECURSIVE> Total pages found in trees: 49
> <RECURSIVE> Total errors during trees traversal: 2
> {noformat}
> The page lists section contains reuse list bucket data: the list meta id,
> the bucket number, and the start pages of the lists found in the bucket. It
> also contains page type statistics:
> {noformat}
> <PAGE_LIST> Page lists info.
> <PAGE_LIST> ---Printing buckets data:
> <PAGE_LIST> List meta id=844420635164675, bucket number=0,
> lists=[844420635164687]
> <PAGE_LIST> -- Page stat:
> <PAGE_LIST> H2ExtrasLeafIO: 32
> <PAGE_LIST> H2ExtrasInnerIO: 1
> <PAGE_LIST> BPlusMetaIO: 1
> <PAGE_LIST> ---No errors.
> {noformat}
> The sequential scan info contains page type statistics as well:
> {noformat}
> ---These pages types were encountered during sequential scan:
> H2ExtrasLeafIO: 165
> H2ExtrasInnerIO: 19
> PagesListNodeIO: 1
> PagesListMetaIO: 1
> MetaStoreLeafIO: 5
> BPlusMetaIO: 20
> PageMetaIO: 1
> MetaStoreInnerIO: 1
> TrackingPageIO: 1
> ---
> Total pages encountered during sequential scan: 214
> Total errors occurred during sequential scan: 0
> {noformat}
> The index reader compares the results of both traversals and the sizes of
> indexes of the same cache, so you only need to look for errors in the
> output. For example, an error message about index size inconsistency looks
> like this:
> {noformat}
> <ERROR> Index size inconsistency: cacheId=2652, typeId=885397586
> <ERROR> Index name: I [idxName=2652_885397586_T0_F0_F1_IDX##H2Tree%0,
> pageId=0002ffff0000000d], size=1700
> <ERROR> Index name: I [idxName=2652_885397586__key_PK##H2Tree%0,
> pageId=0002ffff00000005], size=0
> <ERROR> Index name: I [idxName=2652_885397586_T0_F1_IDX##H2Tree%0,
> pageId=0002ffff00000009], size=1700
> <ERROR> Index name: I [idxName=2652_885397586_T0_F0_IDX##H2Tree%0,
> pageId=0002ffff00000007], size=1700
> <ERROR> Index name: I [idxName=2652_885397586_T0_F2_IDX##H2Tree%0,
> pageId=0002ffff0000000b]
> {noformat}
> {panel}
--
This message was sent by Atlassian Jira
(v8.20.10#820010)