[ https://issues.apache.org/jira/browse/IGNITE-11749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Govorukhin resolved IGNITE-11749.
-----------------------------------------
    Resolution: Fixed

> Implement automatic pages history dump on CorruptedTreeException
> ----------------------------------------------------------------
>
>                 Key: IGNITE-11749
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11749
>             Project: Ignite
>          Issue Type: Improvement
>            Reporter: Alexey Goncharuk
>            Assignee: Anton Kalashnikov
>            Priority: Major
>             Fix For: 2.8
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Currently, the only way to debug possible bugs in the checkpointer/recovery
> mechanics is to manually parse WAL files after the corruption has happened.
> This is not practical for several reasons. First, it requires manual actions
> that depend on the content of the exception. Second, it is not always
> possible to obtain the WAL files (they may contain sensitive data).
> We need to add a mechanism that will dump all information required for
> primary analysis of the corruption to the exception handler. For example, if
> an exception happened when materializing a link {{0xabcd}} written on an
> index page {{0xdcba}}, we need to dump the change history of both pages and
> the checkpoint records on the analysis interval. Possibly, we should also
> include the FreeList pages to which the aforementioned pages belonged.
> Example of output:
> {noformat}
> [2019-05-07 11:57:57,350][INFO ][test-runner-#58%diagnostic.DiagnosticProcessorTest%][PageHistoryDiagnoster]
> Next WAL record :: PageSnapshot [fullPageId = FullPageId
> [pageId=0002ffff00000000, effectivePageId=0000ffff00000000,
> grpId=-2100569601], page = [
>     Header [
>         type=11 (PageMetaIO),
>         ver=1,
>         crc=0,
>         pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
>     ],
>     PageMeta[
>         treeRoot=844420635164675,
>         lastSuccessfulFullSnapshotId=0,
>         lastSuccessfulSnapshotId=0,
>         nextSnapshotTag=1,
>         lastSuccessfulSnapshotTag=0,
>         lastAllocatedPageCount=0,
>         candidatePageCount=0
>     ]],
> super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0,
> fileOff=103, len=4129], type=PAGE_RECORD]]]
> Next WAL record :: CheckpointRecord
> [cpId=c6ba7793-113b-4b54-8530-45e1708ca44c, end=false, cpMark=FileWALPointer
> [idx=0, fileOff=29, len=29], super=WALRecord [size=1963, chainSize=0,
> pos=FileWALPointer [idx=0, fileOff=39686, len=1963], type=CHECKPOINT_RECORD]]
> Next WAL record :: PageSnapshot [fullPageId = FullPageId
> [pageId=0002ffff00000000, effectivePageId=0000ffff00000000,
> grpId=-1368047378], page = [
>     Header [
>         type=11 (PageMetaIO),
>         ver=1,
>         crc=0,
>         pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)
>     ],
>     PageMeta[
>         treeRoot=844420635164675,
>         lastSuccessfulFullSnapshotId=0,
>         lastSuccessfulSnapshotId=0,
>         nextSnapshotTag=1,
>         lastSuccessfulSnapshotTag=0,
>         lastAllocatedPageCount=0,
>         candidatePageCount=0
>     ]],
> super = [WALRecord [size=4129, chainSize=0, pos=FileWALPointer [idx=0,
> fileOff=55961, len=4129], type=PAGE_RECORD]]]
> Next WAL record :: CheckpointRecord
> [cpId=145e599e-66fc-45f5-bde4-b0c392125968, end=false, cpMark=null,
> super=WALRecord [size=21409, chainSize=0, pos=FileWALPointer [idx=0,
> fileOff=13101788, len=21409], type=CHECKPOINT_RECORD]]
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
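
Note on reading the page IDs in the example output: the same page appears once in hex (pageId=0002ffff00000000) and once in decimal with its fields spelled out (pageId=844420635164672(offset=0, flags=10, partId=65535, index=0)). The sketch below shows one way to decode such an ID by hand. The bit layout (8-bit offset, 8-bit flags, 16-bit partition ID, 32-bit page index) and the binary rendering of flags are assumptions inferred from this one sample, not taken from the Ignite sources. PageIdDecoder and its methods are hypothetical names and not part of the Ignite API.

```java
final class PageIdDecoder {
    // Assumed layout, inferred from the sample output above (not from Ignite sources):
    // [ offset: 8 bits | flags: 8 bits | partId: 16 bits | index: 32 bits ]

    /** Lowest 32 bits: index of the page inside its partition. */
    static long pageIndex(long pageId) {
        return pageId & 0xFFFFFFFFL;
    }

    /** Next 16 bits: partition ID (65535 in the sample). */
    static int partId(long pageId) {
        return (int) ((pageId >>> 32) & 0xFFFF);
    }

    /** Next 8 bits: page flags. */
    static int flags(long pageId) {
        return (int) ((pageId >>> 48) & 0xFF);
    }

    /** Highest 8 bits: offset. */
    static int offset(long pageId) {
        return (int) ((pageId >>> 56) & 0xFF);
    }

    /** Formats the ID like the sample output; flags are assumed to be printed in binary. */
    static String describe(long pageId) {
        return pageId
            + "(offset=" + offset(pageId)
            + ", flags=" + Integer.toBinaryString(flags(pageId))
            + ", partId=" + partId(pageId)
            + ", index=" + pageIndex(pageId) + ")";
    }

    public static void main(String[] args) {
        // Hex form as printed in FullPageId [pageId=0002ffff00000000, ...].
        long pageId = Long.parseUnsignedLong("0002ffff00000000", 16);
        System.out.println("pageId=" + describe(pageId));

        // treeRoot value from the PageMeta section of the sample.
        System.out.println("treeRoot=" + describe(844420635164675L));
    }
}
```

Under these assumptions the first line printed matches the Header entry in the sample, and the treeRoot value decodes to page index 3 in the same partition.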