[ https://issues.apache.org/jira/browse/PHOENIX-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Viraj Jasani updated PHOENIX-7039:
----------------------------------
    Description: 
When PhoenixRecordReader needs to iterate over the records of a snapshot-restored table, it uses TableSnapshotResultIterator to retrieve the snapshot manifest and the corresponding region manifests from the snapshot. TableSnapshotResultIterator#next initializes ScanningResultIterator using SnapshotScanner, which in turn opens the given region to perform the scan. However, this region is opened by a client rather than by a regionserver, so if the original region was split or merged, the current region would hold references to parent regions in the hbase archive dir. If the region has already been removed from meta as well as from the file system (hbase data dir) after a successful split/merge operation, region initialization by the client still creates a new seqid file in the region's data dir (on the WAL filesystem). While the region data is read from the archive dir, the region dir creation in the hbase data dir leaves us with a new orphan region that has only a .seqid file and no store file. At the same time, the hbase archive dir still contains the old region dir with a reference to the parent region.

1. Snapshot creation:
{code:java}
2023-09-13 01:01:50,103 DEBUG [557)-snapshot-pool-2] snapshot.SnapshotManifest - Storing 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.' region-info for snapshot=SNAPSHOT_TABLE1_1694566851085_1694566876390_0
{code}
2. Region getting archived after merge:
{code:java}
2023-09-13 02:46:58,177 DEBUG [gionserver-4:60020-8] backup.HFileArchiver - Archived from FileableStoreFile, hdfs://cluster1/hbase/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53 to hdfs://cluster1/hbase/archive/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
{code}
3. Region is deleted from meta and file system:
{code:java}
2023-09-13 02:50:26,054 DEBUG [PEWorker-53] backup.HFileArchiver - Deleted hdfs://cluster1/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53
2023-09-13 02:50:26,123 INFO [PEWorker-53] hbase.MetaTableAccessor - Deleted TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.
2023-09-13 02:50:26,340 INFO [PEWorker-58] procedure2.ProcedureExecutor - Finished pid=1006984, state=SUCCESS; GCMultipleMergedRegionsProcedure child=53161e6b59b7a2dcdb85b26e676fd72a, parents:[b5d1b622ef045b52aede650db8690d53], [cbf697faee6a0c3eaf8c17e1bf12239a] in 434 msec
2023-09-13 02:50:26,269 INFO [PEWorker-58] hbase.MetaTableAccessor - Deleted merge references in TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1685345080046.53161e6b59b7a2dcdb85b26e676fd72a., deleted qualifiers merge0000, merge0001
{code}
4. Snapshot scanner region init:
{code:java}
2023-09-13 04:06:27,637 INFO [main] org.apache.phoenix.iterate.SnapshotScanner: Creating SnapshotScanner for region: {ENCODED => b5d1b622ef045b52aede650db8690d53, NAME => 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.', STARTKEY => '00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe', ENDKEY => '00DAG00000005sXa07\x80\x00\x01\x87\x80\x02P@a07AG0000183cN3017AG00002lPrRe'}
{code}
5. Region dir with seqid gets created:
{code:java}
2023-09-13 04:06:28,431 INFO [on default port 9000] hdfs.StateChange - DIR* completeFile: /hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53/recovered.edits/17042749.seqid is closed by DFSClient_attempt_1692995189831_25389_m_000797_0_-1558517803_1
{code}
6. Remaining region init with store init completion:
{code:java}
2023-09-13 04:06:28,354 INFO [StoreOpener-b5d1b622ef045b52aede650db8690d53-1] org.apache.hadoop.hbase.regionserver.HStore: Store=b5d1b622ef045b52aede650db8690d53/0, memstore type=DefaultMemStore, storagePolicy=HOT, verifyBulkLoads=false, parallelPutCountPrintThreshold=50, encoding=FAST_DIFF, compression=NONE
2023-09-13 04:06:28,439 INFO [main] org.apache.hadoop.hbase.regionserver.HRegion: Opened b5d1b622ef045b52aede650db8690d53; next sequenceid=17042750; SteppingSplitPolicysuper{IncreasingToUpperBoundRegionSplitPolicy{initialSize=536870912, ConstantSizeRegionSplitPolicy{desiredMaxFileSize=11007665920, jitterRate=0.025168776512145996}}}, FlushLargeStoresPolicy{flushSizeLowerBound=-1}
{code}
While opening the region from the client side, we should provide a flag to ensure that the seqid file is not generated, as per HBASE-21977.

  was:
When PhoenixRecordReader needs to iterate over the records of a snapshot-restored table, it uses TableSnapshotResultIterator to retrieve the snapshot manifest and the corresponding region manifests from the snapshot. TableSnapshotResultIterator#next initializes ScanningResultIterator using SnapshotScanner, which in turn opens the given region to perform the scan. However, this region is opened by a client rather than by a regionserver, so if the original region was split or merged, the current region would hold references to parent regions in the hbase archive dir. If the region has already been removed from meta as well as from the file system after a successful split/merge operation, region initialization by the client still creates a new seqid file in the region's root dir. While the region data is read from the archive dir, the region dir creation leaves us with a new orphan region that has only a .seqid file and no store file. At the same time, the hbase archive dir still contains the old region dir with a reference to the parent region.

1. Snapshot creation:
{code:java}
2023-09-13 01:01:50,103 DEBUG [557)-snapshot-pool-2] snapshot.SnapshotManifest - Storing 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.' region-info for snapshot=SNAPSHOT_TABLE1_1694566851085_1694566876390_0
{code}
2. Region getting archived after merge:
{code:java}
2023-09-13 02:46:58,177 DEBUG [gionserver-4:60020-8] backup.HFileArchiver - Archived from FileableStoreFile, hdfs://cluster1/hbase/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53 to hdfs://cluster1/hbase/archive/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
{code}
3. Region is deleted from meta and file system:
{code:java}
2023-09-13 02:50:26,054 DEBUG [PEWorker-53] backup.HFileArchiver - Deleted hdfs://cluster1/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53
2023-09-13 02:50:26,123 INFO [PEWorker-53] hbase.MetaTableAccessor - Deleted TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.
2023-09-13 02:50:26,340 INFO [PEWorker-58] procedure2.ProcedureExecutor - Finished pid=1006984, state=SUCCESS; GCMultipleMergedRegionsProcedure child=53161e6b59b7a2dcdb85b26e676fd72a, parents:[b5d1b622ef045b52aede650db8690d53], [cbf697faee6a0c3eaf8c17e1bf12239a] in 434 msec
2023-09-13 02:50:26,269 INFO [PEWorker-58] hbase.MetaTableAccessor - Deleted merge references in TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1685345080046.53161e6b59b7a2dcdb85b26e676fd72a., deleted qualifiers merge0000, merge0001
{code}
4. Snapshot scanner region init:
{code:java}
2023-09-13 04:06:27,637 INFO [main] org.apache.phoenix.iterate.SnapshotScanner: Creating SnapshotScanner for region: {ENCODED => b5d1b622ef045b52aede650db8690d53, NAME => 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.', STARTKEY => '00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe', ENDKEY => '00DAG00000005sXa07\x80\x00\x01\x87\x80\x02P@a07AG0000183cN3017AG00002lPrRe'}
{code}
5. Region dir with seqid gets created:
{code:java}
2023-09-13 04:06:28,431 INFO [on default port 9000] hdfs.StateChange - DIR* completeFile: /hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53/recovered.edits/17042749.seqid is closed by DFSClient_attempt_1692995189831_25389_m_000797_0_-1558517803_1
{code}
6. Remaining region init with store init completion:
{code:java}
2023-09-13 04:06:28,354 INFO [StoreOpener-b5d1b622ef045b52aede650db8690d53-1] org.apache.hadoop.hbase.regionserver.HStore: Store=b5d1b622ef045b52aede650db8690d53/0, memstore type=DefaultMemStore, storagePolicy=HOT, verifyBulkLoads=false, parallelPutCountPrintThreshold=50, encoding=FAST_DIFF, compression=NONE
2023-09-13 04:06:28,439 INFO [main] org.apache.hadoop.hbase.regionserver.HRegion: Opened b5d1b622ef045b52aede650db8690d53; next sequenceid=17042750; SteppingSplitPolicysuper{IncreasingToUpperBoundRegionSplitPolicy{initialSize=536870912, ConstantSizeRegionSplitPolicy{desiredMaxFileSize=11007665920, jitterRate=0.025168776512145996}}}, FlushLargeStoresPolicy{flushSizeLowerBound=-1}
{code}
While opening the region from the client side, we should provide a flag to ensure that the seqid file is not generated, as per HBASE-21977.

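The seqid marker from step 5 is the only trace the client-side region open leaves in the hbase data dir, so recognizing such paths in HDFS logs or directory listings is one way to spot affected regions. The following is a minimal Java sketch for that, not Phoenix or HBase code: the class `SeqIdMarker` and its methods are hypothetical names, and the path layout is assumed from the log line in step 5. In the logs above, the marker file name (17042749) is one less than the `next sequenceid` reported by HRegion (17042750); the sketch assumes that relationship.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of Phoenix or HBase. */
final class SeqIdMarker {
    // Assumed layout, taken from the step 5 log line:
    // <root>/data/<namespace>/<table>/<encodedRegion>/recovered.edits/<seqid>.seqid
    private static final Pattern MARKER = Pattern.compile(
        "/data/([^/]+)/([^/]+)/([0-9a-f]{32})/recovered\\.edits/(\\d{1,18})\\.seqid$");

    public final String namespace;
    public final String table;
    public final String encodedRegion;
    public final long seqId;

    private SeqIdMarker(String namespace, String table, String encodedRegion, long seqId) {
        this.namespace = namespace;
        this.table = table;
        this.encodedRegion = encodedRegion;
        this.seqId = seqId;
    }

    /** Parses a seqid marker path under the data dir; returns empty for any other path. */
    public static Optional<SeqIdMarker> parse(String path) {
        // Paths under the archive dir are not live region dirs, so they are ignored.
        if (path.contains("/archive/")) {
            return Optional.empty();
        }
        Matcher m = MARKER.matcher(path);
        if (!m.find()) {
            return Optional.empty();
        }
        return Optional.of(new SeqIdMarker(
            m.group(1), m.group(2), m.group(3), Long.parseLong(m.group(4))));
    }

    /** Next sequence id implied by the marker, assuming marker = next - 1 as in the logs. */
    public long nextSequenceId() {
        return seqId + 1;
    }
}
```

Calling `SeqIdMarker.parse` on the path from step 5 yields the encoded region name, which can then be compared against the regions still present in meta to decide whether the directory is an orphan.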
> Snapshot scanner should skip replay WAL and update seqid while opening region
> -----------------------------------------------------------------------------
>
>                 Key: PHOENIX-7039
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-7039
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 5.1.3
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>             Fix For: 5.2.0, 5.1.4

--
This message was sent by Atlassian Jira
(v8.20.10#820010)