[
https://issues.apache.org/jira/browse/HBASE-27698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17709991#comment-17709991
]
Rajeshbabu Chintaguntla edited comment on HBASE-27698 at 4/9/23 8:09 PM:
-------------------------------------------------------------------------
[~vjasani] [~taklwu]
During an express upgrade, mainly from HBase 1.x to HBase 2.x, as mentioned in the
comment above, the region state info for hbase:meta won't be identified properly:
neither ZooKeeper nor the WAL files have the meta location, because no RS is
holding meta at that point. In that case it is better to proceed with the meta
assignment instead of throwing an exception when the hbase:meta directory is not
partial.
With this change of not throwing an exception (raised PR
[https://github.com/apache/hbase/pull/5167]) when hbase:meta is not partial, the
meta is initialised properly without any further rebuilding or ZooKeeper znode
creation. Verified the upgrade path from HBase 1.x to 2.5.2 in a cluster where the
master needed to be restarted only once, after updating the meta table schema.
After starting the master, all the tables and regions came up without any issues.
{noformat}
2023-04-04 19:41:13,013 INFO [master/host023:16000:becomeActiveMaster] hbase.ChoreService: Chore ScheduledChore name=SnapshotCleaner, period=1800000, unit=MILLISECONDS is enabled.
2023-04-04 19:41:13,077 WARN [PEWorker-10] procedure.InitMetaProcedure: Can not delete partial created meta table, continue...
2023-04-04 19:41:13,093 INFO [PEWorker-10] regionserver.HRegion: creating {ENCODED => 1588230740, NAME => 'hbase:meta,,1', STARTKEY => '', ENDKEY => ''}, tableDescriptor='hbase:meta', {TABLE_ATTRIBUTES => {IS_META => 'true', coprocessor$1 => '|org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint|536870911|'}}, {NAME => 'info', INDEX_BLOCK_ENCODING => 'NONE', VERSIONS => '10', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'NONE', IN_MEMORY => 'true', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '8192 B (8KB)', METADATA => {'CACHE_DATA_IN_L1' => 'true'}}, regionDir=hdfs://host023:8020/apps/hbase/data
2023-04-04 19:41:13,098 WARN [PEWorker-10] regionserver.HRegionFileSystem: Trying to create a region that already exists on disk: hdfs://host023:8020/apps/hbase/data/data/hbase/meta/1588230740
{noformat}
{noformat}
org.apache.hadoop.hbase.PleaseRestartMasterException: Aborting active master after missing CFs are successfully added in meta. Subsequent active master initialization should be uninterrupted
  at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1218)
  at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2394)
  at org.apache.hadoop.hbase.master.HMaster.lambda$null$0(HMaster.java:563)
  at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:187)
  at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:177)
  at org.apache.hadoop.hbase.master.HMaster.lambda$run$1(HMaster.java:560)
  at java.lang.Thread.run(Thread.java:750)
2023-04-04 19:41:51,837 ERROR [master/sl73caehmapd023:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
2023-04-04 19:41:51,837 ERROR [master/sl73caehmapd023:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master sl73caehmapd023.visa.com,16000,1680637260893: Unhandled exception. Starting shutdown. *****
org.apache.hadoop.hbase.PleaseRestartMasterException: Aborting active master after missing CFs are successfully added in meta. Subsequent active master initialization should be uninterrupted
  at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1218)
  at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2394)
  at org.apache.hadoop.hbase.master.HMaster.lambda$null$0(HMaster.java:563)
  at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:187)
  at org.apache.hadoop.hbase.trace.TraceUtil.trace(TraceUtil.java:177)
  at org.apache.hadoop.hbase.master.HMaster.lambda$run$1(HMaster.java:560)
  at java.lang.Thread.run(Thread.java:750)
{noformat}
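The handling proposed in PR 5167 can be sketched as follows. This is a hypothetical illustration of the decision only; the class, enum, and method names below are not the actual HBase internals.

```java
// Hypothetical sketch of the behavior proposed in PR 5167: when a complete
// hbase:meta directory already exists on the filesystem (e.g. after an
// upgrade from 1.x where neither ZK nor the WALs carry a meta location),
// reuse it and go straight to assignment instead of throwing.
public class MetaInitDecision {
  enum Action { CREATE_FRESH_META, ASSIGN_EXISTING_META }

  static Action decide(boolean metaDirExists, boolean metaDirIsPartial) {
    if (metaDirExists && !metaDirIsPartial) {
      // A fully formed meta directory: keep it and proceed with assignment.
      return Action.ASSIGN_EXISTING_META;
    }
    // Fresh install, or partial leftovers that get recreated.
    return Action.CREATE_FRESH_META;
  }

  public static void main(String[] args) {
    // The upgrade-from-1.x case: complete meta dir on disk, no ZK/WAL location.
    System.out.println(decide(true, false));
  }
}
```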
> Migrate meta locations from zookeeper to master data may not always be possible
> if we migrate from 1.x HBase
> ---------------------------------------------------------------------------------------------------------
>
> Key: HBASE-27698
> URL: https://issues.apache.org/jira/browse/HBASE-27698
> Project: HBase
> Issue Type: Bug
> Components: migration
> Affects Versions: 2.5.0
> Reporter: Rajeshbabu Chintaguntla
> Assignee: Rajeshbabu Chintaguntla
> Priority: Major
>
> In HBase 1.x versions, the meta server location in ZooKeeper is removed when
> the server is stopped. In such cases, migrating to 2.5.x branches may not
> create any meta entries in the master local region. So, if we cannot find the
> meta location in ZooKeeper, we can get it from the WAL directories with the
> .meta extension and add it to the master local region.
> {noformat}
>   private void tryMigrateMetaLocationsFromZooKeeper() throws IOException, KeeperException {
>     // try migrate data from zookeeper
>     try (ResultScanner scanner =
>       masterRegion.getScanner(new Scan().addFamily(HConstants.CATALOG_FAMILY))) {
>       if (scanner.next() != null) {
>         // notice that all replicas for a region are in the same row, so the migration can be
>         // done with in a one row put, which means if we have data in catalog family then we can
>         // make sure that the migration is done.
>         LOG.info("The {} family in master local region already has data in it, skip migrating...",
>           HConstants.CATALOG_FAMILY_STR);
>         return;
>       }
>     }
>     // start migrating
>     byte[] row = CatalogFamilyFormat.getMetaKeyForRegion(RegionInfoBuilder.FIRST_META_REGIONINFO);
>     Put put = new Put(row);
>     List<String> metaReplicaNodes = zooKeeper.getMetaReplicaNodes();
>     StringBuilder info = new StringBuilder("Migrating meta locations:");
>     for (String metaReplicaNode : metaReplicaNodes) {
>       int replicaId = zooKeeper.getZNodePaths().getMetaReplicaIdFromZNode(metaReplicaNode);
>       RegionState state = MetaTableLocator.getMetaRegionState(zooKeeper, replicaId);
>       info.append(" ").append(state);
>       put.setTimestamp(state.getStamp());
>       MetaTableAccessor.addRegionInfo(put, state.getRegion());
>       if (state.getServerName() != null) {
>         MetaTableAccessor.addLocation(put, state.getServerName(), HConstants.NO_SEQNUM, replicaId);
>       }
>       put.add(CellBuilderFactory.create(CellBuilderType.SHALLOW_COPY).setRow(put.getRow())
>         .setFamily(HConstants.CATALOG_FAMILY)
>         .setQualifier(RegionStateStore.getStateColumn(replicaId)).setTimestamp(put.getTimestamp())
>         .setType(Cell.Type.Put).setValue(Bytes.toBytes(state.getState().name())).build());
>     }
>     if (!put.isEmpty()) {
>       LOG.info(info.toString());
>       masterRegion.update(r -> r.put(put));
>     } else {
>       LOG.info("No meta location available on zookeeper, skip migrating...");
>     }
>   }
> {noformat}
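The WAL-directory fallback described above (finding the server whose WAL directory contains files with the .meta suffix) could look roughly like the sketch below. The method name findMetaServer, the map-based input, and the sample server/file names are illustrative, not HBase API.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch: in HBase, WALs written for hbase:meta carry a ".meta"
// suffix and live under the hosting RegionServer's WAL directory
// (e.g. /hbase/WALs/<host>,<port>,<startcode>). Scanning the WAL listings for
// that suffix identifies a server that hosted meta.
public class WalMetaLocator {
  static final String META_WAL_SUFFIX = ".meta";

  // walFilesByServerDir maps a server WAL directory name to the file names in it.
  static Optional<String> findMetaServer(Map<String, List<String>> walFilesByServerDir) {
    return walFilesByServerDir.entrySet().stream()
      .filter(e -> e.getValue().stream().anyMatch(f -> f.endsWith(META_WAL_SUFFIX)))
      .map(Map.Entry::getKey)
      .findFirst();
  }

  public static void main(String[] args) {
    // Illustrative layout: only host2's WAL directory holds a meta WAL.
    Map<String, List<String>> wals = Map.of(
      "host1.example.com,16020,1680637260893",
      List.of("host1.example.com%2C16020.1680637261000"),
      "host2.example.com,16020,1680637260894",
      List.of("host2.example.com%2C16020.1680637262000",
        "host2.example.com%2C16020.1680637262500.meta"));
    System.out.println(findMetaServer(wals).orElse("none"));
  }
}
```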
--
This message was sent by Atlassian Jira
(v8.20.10#820010)