[
https://issues.apache.org/jira/browse/PHOENIX-7039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Viraj Jasani resolved PHOENIX-7039.
-----------------------------------
Resolution: Fixed
> Snapshot scanner should skip replay WAL and update seqid while opening region
> -----------------------------------------------------------------------------
>
> Key: PHOENIX-7039
> URL: https://issues.apache.org/jira/browse/PHOENIX-7039
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.1.3
> Reporter: Viraj Jasani
> Assignee: Viraj Jasani
> Priority: Major
> Fix For: 5.2.0, 5.1.4
>
>
> When PhoenixRecordReader needs to iterate over the records of a
> snapshot-restored table, it uses TableSnapshotResultIterator to retrieve the
> snapshot manifest and the corresponding region manifests from the snapshot.
> TableSnapshotResultIterator#next initializes ScanningResultIterator using
> SnapshotScanner, which in turn opens the given region to perform the scan.
> However, this region is opened by a client and not by any regionserver.
> Hence, if the original region was split or merged, the current region would
> be holding references to the parent regions in the hbase archive dir. If the
> region has already been removed from meta as well as from the file system
> (hbase data dir) after a successful split/merge operation, region
> initialization by the client still creates a new seqid file in the region's
> data dir (on the WAL filesystem). Although the region data is read from the
> archive dir, the creation of the region dir in the hbase data dir leaves us
> with a new orphan region that has only a .seqid file and no store files. At
> the same time, the hbase archive dir still contains the old region dir with
> the reference to the parent region.
>
> 1. Snapshot creation:
> {code:java}
> 2023-09-13 01:01:50,103 DEBUG [557)-snapshot-pool-2]
> snapshot.SnapshotManifest - Storing
> 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.'
> region-info for snapshot=SNAPSHOT_TABLE1_1694566851085_1694566876390_0
> {code}
> 2. Region getting archived after merge:
> {code:java}
> 2023-09-13 02:46:58,177 DEBUG [gionserver-4:60020-8] backup.HFileArchiver -
> Archived from FileableStoreFile,
> hdfs://cluster1/hbase/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
> to
> hdfs://cluster1/hbase/archive/data/default/TABLE1/53161e6b59b7a2dcdb85b26e676fd72a/0/aa5058a23c024463bb33bbb2abc68577.b5d1b622ef045b52aede650db8690d53
> {code}
> 3. Region is deleted from meta and file system:
> {code:java}
> 2023-09-13 02:50:26,054 DEBUG [PEWorker-53] backup.HFileArchiver - Deleted
> hdfs://cluster1/hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53
> 2023-09-13 02:50:26,123 INFO [PEWorker-53] hbase.MetaTableAccessor - Deleted
> TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.
> 2023-09-13 02:50:26,340 INFO [PEWorker-58] procedure2.ProcedureExecutor -
> Finished pid=1006984, state=SUCCESS; GCMultipleMergedRegionsProcedure
> child=53161e6b59b7a2dcdb85b26e676fd72a,
> parents:[b5d1b622ef045b52aede650db8690d53],
> [cbf697faee6a0c3eaf8c17e1bf12239a] in 434 msec
> 2023-09-13 02:50:26,269 INFO [PEWorker-58] hbase.MetaTableAccessor - Deleted
> merge references in
> TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1685345080046.53161e6b59b7a2dcdb85b26e676fd72a.,
> deleted qualifiers merge0000, merge0001
> {code}
> 4. Snapshot scanner region init:
> {code:java}
> 2023-09-13 04:06:27,637 INFO [main]
> org.apache.phoenix.iterate.SnapshotScanner:
> Creating SnapshotScanner for region:
> {ENCODED => b5d1b622ef045b52aede650db8690d53, NAME =>
> 'TABLE1,00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe,1684558830177.b5d1b622ef045b52aede650db8690d53.',
> STARTKEY =>
> '00DAG00000005sXa07\x80\x00\x01\x87p/\xB38a07AG000017Kx7Z017AG00002j9Jxe',
> ENDKEY =>
> '00DAG00000005sXa07\x80\x00\x01\x87\x80\x02P@a07AG0000183cN3017AG00002lPrRe'}
> {code}
> 5. Region dir with seqid gets created:
> {code:java}
> 2023-09-13 04:06:28,431 INFO [on default port 9000] hdfs.StateChange - DIR*
> completeFile:
> /hbase/data/default/TABLE1/b5d1b622ef045b52aede650db8690d53/recovered.edits/17042749.seqid
> is closed by DFSClient_attempt_1692995189831_25389_m_000797_0_-1558517803_1
> {code}
> 6. Remaining region init with store init completion:
> {code:java}
> 2023-09-13 04:06:28,354 INFO [StoreOpener-b5d1b622ef045b52aede650db8690d53-1]
> org.apache.hadoop.hbase.regionserver.HStore:
> Store=b5d1b622ef045b52aede650db8690d53/0, memstore type=DefaultMemStore,
> storagePolicy=HOT, verifyBulkLoads=false, parallelPutCountPrintThreshold=50,
> encoding=FAST_DIFF, compression=NONE
> 2023-09-13 04:06:28,439 INFO [main]
> org.apache.hadoop.hbase.regionserver.HRegion:
> Opened b5d1b622ef045b52aede650db8690d53;
> next sequenceid=17042750;
> SteppingSplitPolicysuper{IncreasingToUpperBoundRegionSplitPolicy{initialSize=536870912,
> ConstantSizeRegionSplitPolicy{desiredMaxFileSize=11007665920,
> jitterRate=0.025168776512145996}}},
> FlushLargeStoresPolicy{flushSizeLowerBound=-1}
> {code}
> While opening the region from the client side, we should provide a flag to
> ensure that the seqid file is not generated, as per HBASE-21977.
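
The behaviour described above can be illustrated with a small toy model. This is only a sketch under stated assumptions: ToyRegion, setSkipSeqIdUpdate and the temporary directory are hypothetical names invented for illustration and are not the HBase or Phoenix API. The sequence id 17042749 and the region name are taken from the logs above. The sketch models one thing only: an open that writes a recovered.edits/<seqid>.seqid marker unless the caller flags the region as opened from a snapshot.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Toy model of a client-side region open. Not HBase code. */
public class RestoredRegionOpenSketch {

    /** Hypothetical stand-in for a region opened by a snapshot scanner. */
    static final class ToyRegion {
        private final Path dataDir;        // plays the role of the hbase data dir
        private final String encodedName;  // encoded region name
        private final long maxSeqId;       // highest sequence id found in the store files
        private boolean skipSeqIdUpdate;   // hypothetical flag, false by default

        ToyRegion(Path dataDir, String encodedName, long maxSeqId) {
            this.dataDir = dataDir;
            this.encodedName = encodedName;
            this.maxSeqId = maxSeqId;
        }

        void setSkipSeqIdUpdate(boolean skipSeqIdUpdate) {
            this.skipSeqIdUpdate = skipSeqIdUpdate;
        }

        /** Opens the region and returns the next sequence id. */
        long open() throws IOException {
            if (!skipSeqIdUpdate) {
                // Without the flag, the open creates the region dir and a .seqid marker,
                // even if the real region dir was already deleted after a merge.
                Path edits = dataDir.resolve(encodedName).resolve("recovered.edits");
                Files.createDirectories(edits);
                Files.createFile(edits.resolve(maxSeqId + ".seqid"));
            }
            return maxSeqId + 1;
        }
    }

    public static void main(String[] args) throws IOException {
        Path dataDir = Files.createTempDirectory("toy-hbase-data");
        String region = "b5d1b622ef045b52aede650db8690d53";

        // Client-side open with the flag set: nothing is written to the data dir.
        ToyRegion flagged = new ToyRegion(dataDir, region, 17042749L);
        flagged.setSkipSeqIdUpdate(true);
        flagged.open();
        System.out.println("flag set, region dir exists: "
                + Files.exists(dataDir.resolve(region)));

        // Open without the flag: an orphan region dir holding only a .seqid file appears.
        ToyRegion unflagged = new ToyRegion(dataDir, region, 17042749L);
        unflagged.open();
        System.out.println("flag unset, region dir exists: "
                + Files.exists(dataDir.resolve(region)));
    }
}
```

In this toy model the flag defaults to false, so only callers that know they are reading snapshot data from the archive dir opt out of the seqid write; whether the actual fix follows the same shape should be checked against the committed patch.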