[ 
https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmitriy Govorukhin resolved IGNITE-11749.
-----------------------------------------
    Resolution: Fixed

> Implement automatic pages history dump on CorruptedTreeException
> ----------------------------------------------------------------
>
>                 Key: IGNITE-11749
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11749
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Assignee: Anton Kalashnikov
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the only way to debug possible bugs in the checkpointer/recovery 
> mechanics is to manually parse WAL files after the corruption has happened. 
> This is impractical for two reasons. First, it requires manual actions that 
> depend on the content of the exception. Second, it is not always possible to 
> obtain the WAL files (they may contain sensitive data).
> We need to add a mechanism to the exception handler that dumps all the 
> information required for primary analysis of the corruption. For example, if 
> an exception happened when materializing a link {{0xabcd}} written on an 
> index page {{0xdcba}}, we need to dump the change history of both pages and 
> the checkpoint records in the analysis interval. Possibly, we should also 
> include the FreeList pages to which the aforementioned pages belonged.
> Example of output:
> {noformat}
> [2019-05-07 11:57:57,350][INFO 
> ][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster] 
> Next WAL record :: PageSnapshot [fullPageId = FullPageId 
> [pageId=0002ffff00000000, effectivePageId=0000ffff00000000, 
> grpId=-2100569601], page = [
> Header [
>       type=11 (PageMetaIO),
>       ver=1,
>       crc=0,
>       pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
> ],
> PageMeta[
>       treeRoot=844420635164675,
>       lastSuccessfulFullSnapshotId=0,
>       lastSuccessfulSnapshotId=0,
>       nextSnapshotTag=1,
>       lastSuccessfulSnapshotTag=0,
>       lastAllocatedPageCount=0,
>       candidatePageCount=0
> ]],
> super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
> fileOff=103, len=4129], type=PAGE_RECORD]]]
> Next WAL record :: CheckpointRecord 
> [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, end=false, cpMark=FileWALPointer 
> [idx=0, fileOff=29, len=29], super=WALRecord [size=1963, chainSize=0, 
> pos=FileWALPointer [idx=0, fileOff=39686, len=1963], type=CHECKPOINT_RECORD]]
> Next WAL record :: PageSnapshot [fullPageId = FullPageId 
> [pageId=0002ffff00000000, effectivePageId=0000ffff00000000, 
> grpId=-1368047378], page = [
> Header [
>       type=11 (PageMetaIO),
>       ver=1,
>       crc=0,
>       pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
> ],
> PageMeta[
>       treeRoot=844420635164675,
>       lastSuccessfulFullSnapshotId=0,
>       lastSuccessfulSnapshotId=0,
>       nextSnapshotTag=1,
>       lastSuccessfulSnapshotTag=0,
>       lastAllocatedPageCount=0,
>       candidatePageCount=0
> ]],
> super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0, 
> fileOff=55961, len=4129], type=PAGE_RECORD]]]
> Next WAL record :: CheckpointRecord 
> [cpId=145e599e-66fc-45f5-bde4-b0c392125968, end=false, cpMark=null, 
> super=WALRecord [size=21409, chainSize=0, pos=FileWALPointer [idx=0, 
> fileOff=13101788, len=21409], type=CHECKPOINT_RECORD]]
> {noformat}
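
The filtering idea in the description above can be illustrated with a minimal sketch. It assumes a simplified, hypothetical record type (Rec) and method (pageHistory); neither is part of Ignite's API, and the real implementation reads records from the WAL rather than from an in-memory list. The sketch keeps every record that touches one of the pages of interest, plus all checkpoint records, in WAL order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class PageHistorySketch {
    /** Hypothetical, simplified stand-in for a WAL record (not Ignite's WALRecord). */
    public static final class Rec {
        final String type;  // e.g. "PAGE_RECORD" or "CHECKPOINT_RECORD"
        final long pageId;  // 0 when the record is not tied to a single page
        final String text;  // printable form of the record

        public Rec(String type, long pageId, String text) {
            this.type = type;
            this.pageId = pageId;
            this.text = text;
        }
    }

    /**
     * Returns the printable history for the given pages: every record that
     * touches one of the page ids, plus every checkpoint record, in WAL order.
     */
    public static List<String> pageHistory(List<Rec> wal, Set<Long> pageIds) {
        List<String> out = new ArrayList<>();

        for (Rec r : wal) {
            boolean checkpoint = "CHECKPOINT_RECORD".equals(r.type);

            if (checkpoint || pageIds.contains(r.pageId))
                out.add("Next WAL record :: " + r.text);
        }

        return out;
    }

    public static void main(String[] args) {
        // Illustrative data only; the ids mirror the example in the description.
        List<Rec> wal = Arrays.asList(
            new Rec("PAGE_RECORD", 0xdcbaL, "PageSnapshot [pageId=0xdcba]"),
            new Rec("PAGE_RECORD", 0x1111L, "PageSnapshot [pageId=0x1111]"),
            new Rec("CHECKPOINT_RECORD", 0L, "CheckpointRecord [end=false]"),
            new Rec("PAGE_RECORD", 0xabcdL, "PageSnapshot [pageId=0xabcd]"));

        Set<Long> pages = new HashSet<>(Arrays.asList(0xabcdL, 0xdcbaL));

        for (String line : pageHistory(wal, pages))
            System.out.println(line);
    }
}
```

Keeping checkpoint records unconditionally is a design choice of this sketch: they mark the boundaries of the analysis interval, so page changes can be read relative to them.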



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
