[jira] [Created] (HBASE-27700) rolling-restart.sh stops all masters at the same time
Jack Yang created HBASE-27700:

Summary: rolling-restart.sh stops all masters at the same time
Key: HBASE-27700
URL: https://issues.apache.org/jira/browse/HBASE-27700
Project: HBase
Issue Type: Improvement
Reporter: Jack Yang

The rolling-restart.sh script in $HBASE_HOME/bin stops all master services (including the backup ones) at the same time, and then restarts them all at once:

{code:java}
# The content of rolling-restart.sh
...
# stop all masters before re-start to avoid races for master znode
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" stop master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
  --hosts "${HBASE_BACKUP_MASTERS}" stop master-backup

# make sure the master znode has been deleted before continuing
zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
...
# all masters are down, now restart
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" ${START_CMD_DIST_MODE} master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
  --hosts "${HBASE_BACKUP_MASTERS}" ${START_CMD_DIST_MODE} master-backup
{code}

This leaves the HMaster service unavailable for the whole window. We can restart them in a more graceful way, like this:
* Stop the backup masters, and then restart them one by one
* Stop the active master; one of the backup masters then becomes active
* Start the original active master, which now runs as a backup

Will upload a patch soon.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
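The proposed sequence can be sketched in shell. This is only an illustration of the ordering: `daemon_cmd` is a hypothetical stand-in for the real `hbase-daemon.sh`/`hbase-daemons.sh` invocations, and `BACKUP_MASTERS` is a made-up host list, so the flow can be shown without a running cluster.

```shell
# Hypothetical sketch of the proposed graceful restart order; daemon_cmd
# stands in for "$bin"/hbase-daemon.sh so no real cluster is required.
daemon_cmd() { echo "daemon: $*"; }

BACKUP_MASTERS="backup1 backup2"

# 1. Restart the backup masters one by one, so they are never all down at once.
for host in $BACKUP_MASTERS; do
  daemon_cmd --hosts "$host" stop master-backup
  daemon_cmd --hosts "$host" start master-backup
done

# 2. Stop the active master; a backup master takes over via the master znode.
daemon_cmd stop master

# 3. Start the original active master again; it now serves as a backup.
daemon_cmd start master
```

At no point in this ordering are all masters stopped simultaneously, which is the availability gap the issue describes.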
[jira] [Created] (HBASE-27699) User metrics for filtered and read rows are too expensive
Bryan Beaudreault created HBASE-27699:

Summary: User metrics for filtered and read rows are too expensive
Key: HBASE-27699
URL: https://issues.apache.org/jira/browse/HBASE-27699
Project: HBase
Issue Type: Improvement
Reporter: Bryan Beaudreault

MetricsUserAggregateImpl has a pattern like this:

{code:java}
String user = getActiveUser();
if (user != null) {
  MetricsUserSource userSource = getOrCreateMetricsUser(user);
  incrementFilteredReadRequests(userSource);
}
{code}

So every update involves a getOrCreate call, which does a ConcurrentHashMap lookup. This overhead is not too bad for most requests, because it is executed just once per request (i.e. updatePut gets called once at the end, though for multis it happens once per action). But updateFilteredReadRequests and updateReadRequestCount are currently called in RegionScannerImpl for every row scanned or filtered, so doing the map lookup over and over adds up. Profiling the regionserver under load, I see over 5% of the time spent updating these metrics.

We should try to collect these metrics in the RpcCallContext, and then translate them into user metrics once at the end of the request. Or otherwise find a way to minimize querying the ConcurrentHashMap multiple times in the context of a request. Maybe we should actually stash the MetricsUserSource in the RpcCallContext so that all user metrics only need to do the lookup once, even for multis.
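The "stash the lookup result in the call context" idea can be sketched in plain Java. All names here are illustrative, not the actual MetricsUserAggregateImpl/RpcCallContext API: the point is only that the ConcurrentHashMap lookup happens at most once per request, while the per-row hot path touches a cached field.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: resolve the per-user metrics object once per request
// (memoized in a request-scoped context) instead of doing a map lookup for
// every row scanned or filtered.
class UserMetricsSketch {
  // Stands in for the user -> MetricsUserSource map; long[1] is a toy counter.
  static final ConcurrentHashMap<String, long[]> USER_SOURCES = new ConcurrentHashMap<>();

  // Stands in for RpcCallContext: caches the resolved source for the request.
  static class CallContext {
    final String user;
    private long[] cachedSource;
    CallContext(String user) { this.user = user; }

    long[] source() {
      if (cachedSource == null) { // map lookup happens at most once per request
        cachedSource = USER_SOURCES.computeIfAbsent(user, u -> new long[1]);
      }
      return cachedSource;
    }
  }

  static void incrementFilteredRows(CallContext ctx, int rows) {
    ctx.source()[0] += rows; // hot path: plain field access, no map lookup
  }
}
```

The same context object could serve every user metric in the request, which would also cover the multi case mentioned above.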
[jira] [Created] (HBASE-27698) Migrating meta locations from zookeeper may not always be possible if we migrate from 1.x HBase
Rajeshbabu Chintaguntla created HBASE-27698:

Summary: Migrating meta locations from zookeeper may not always be possible if we migrate from 1.x HBase
Key: HBASE-27698
URL: https://issues.apache.org/jira/browse/HBASE-27698
Project: HBase
Issue Type: Bug
Components: migration
Reporter: Rajeshbabu Chintaguntla
Assignee: Rajeshbabu Chintaguntla

In HBase 1.x versions, the meta server location is removed from zookeeper when the server is stopped. In such cases, migrating to the 2.5.x branches may not create any meta entries in the master data. So, if we cannot find the meta location in zookeeper, we can recover the meta location from the WAL directories with the .meta extension and add it to the master data.
[DISCUSS] Remove checkRefCnt for on-heap ByteBuffs?
Hey all,

We have a use-case for HFile.Reader at my company: scanning many hfiles for a small subset of cells, so it mostly just iterates a large number of HFiles and discards most of the cells. We recently upgraded that deployable from a very old CDH 5.16 (roughly hbase 1.2) to hbase 2.5.3. In doing so, we noticed a performance regression of around 4x. I imagine this regression would also affect ClientSideRegionScanner.

We did some profiling and noticed that a large amount of time is spent in SingleByteBuff.checkRefCnt. Every SingleByteBuff method calls checkRefCnt, and each call reads a volatile through AtomicIntegerFieldUpdater in the netty code. I believe ref counting is mostly necessary for off-heap buffers, but on-heap buffers are also wrapped in SingleByteBuff and so also go through checkRefCnt.

We removed the checkRefCnt call, and the regression disappeared. We created a simple test method which just does HFile.createReader, reader.getScanner(), and then iterates the scanner counting the total cells. The test reads an hfile with 100M cells and takes over 1 minute with checkRefCnt. Removing checkRefCnt brings the runtime down to 20s. It's worth noting that the regression is most prominent on java17; it's slightly less obvious on java11, with runtimes of 40s vs 28s.

Thoughts on updating SingleByteBuff to skip the checkRefCnt call for on-heap buffers? We can handle this in the constructor, when wrapping an on-heap buffer here [1].

[1] https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBuffAllocator.java#L300
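The proposed "decide at construction time" change can be sketched as follows. This is a toy stand-in, not the actual SingleByteBuff implementation: the class and method names are illustrative, and the single volatile int stands in for netty's AtomicIntegerFieldUpdater-based ref count. The idea is simply that heap buffers, being garbage-collected, skip the per-access release check.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch: only perform the reference-count check when the wrapped
// buffer is off-heap (direct), since on-heap buffers need no use-after-release
// protection. The flag is computed once, in the constructor.
class RefCheckedBuff {
  private final ByteBuffer buf;
  private final boolean needRefCheck; // false for heap buffers: skip volatile read
  private volatile int refCnt = 1;

  RefCheckedBuff(ByteBuffer buf) {
    this.buf = buf;
    this.needRefCheck = buf.isDirect();
  }

  private void ensureNotReleased() {
    if (needRefCheck && refCnt <= 0) {
      throw new IllegalStateException("buffer already released");
    }
  }

  byte get(int index) {
    ensureNotReleased(); // no-op on the hot path for on-heap buffers
    return buf.get(index);
  }

  void release() {
    refCnt = 0;
  }
}
```

With this shape, the per-cell read path for on-heap buffers avoids touching the ref-count volatile entirely, which is where the profiled time was going.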
[jira] [Created] (HBASE-27697) Create a dummy metric implementation in ConnectionImplementation.
Rushabh Shah created HBASE-27697:

Summary: Create a dummy metric implementation in ConnectionImplementation.
Key: HBASE-27697
URL: https://issues.apache.org/jira/browse/HBASE-27697
Project: HBase
Issue Type: Bug
Components: metrics
Affects Versions: 2.5.0
Reporter: Rushabh Shah

This Jira is for branch-2 only. If the CLIENT_SIDE_METRICS_ENABLED_KEY conf is set to true, we initialize metrics to a MetricsConnection; otherwise it is set to null.

{code:java}
if (conf.getBoolean(CLIENT_SIDE_METRICS_ENABLED_KEY, false)) {
  this.metricsScope = MetricsConnection.getScope(conf, clusterId, this);
  this.metrics = MetricsConnection.getMetricsConnection(this.metricsScope,
    this::getBatchPool, this::getMetaLookupPool);
} else {
  this.metrics = null;
}
{code}

Whenever we want to update metrics, we always do a null check. We can improve this by creating a dummy metrics object with an empty implementation. When we want to populate the metrics, we can check whether metrics is an instance of the dummy metrics.
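The dummy-metrics idea is the classic null-object pattern. A minimal sketch follows; ClientMetrics, NoopMetrics, and ConnectionSketch are illustrative names, not the actual MetricsConnection/ConnectionImplementation API.

```java
// Hypothetical sketch of the proposed null-object pattern: install a no-op
// metrics implementation when client-side metrics are disabled, so call sites
// never need a null check.
interface ClientMetrics {
  void incrCacheHit();
}

// Empty implementation used when metrics are disabled.
final class NoopMetrics implements ClientMetrics {
  static final NoopMetrics INSTANCE = new NoopMetrics();
  @Override public void incrCacheHit() { /* intentionally empty */ }
}

// Real implementation used when metrics are enabled.
final class CountingMetrics implements ClientMetrics {
  long cacheHits;
  @Override public void incrCacheHit() { cacheHits++; }
}

final class ConnectionSketch {
  final ClientMetrics metrics; // never null

  ConnectionSketch(boolean metricsEnabled) {
    this.metrics = metricsEnabled ? new CountingMetrics() : NoopMetrics.INSTANCE;
  }

  void onCacheHit() {
    metrics.incrCacheHit(); // always safe; no "if (metrics != null)" needed
  }
}
```

An `instanceof NoopMetrics` check then remains available for any code path that wants to skip expensive metric computation up front.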
Re: [DISCUSS] Deprecate the 'hbase.regionserver.hlog.reader.impl' and 'hbase.regionserver.hlog.writer.impl' configurations
Going to merge the PR tomorrow if there are no other objections.

Thanks.

张铎(Duo Zhang) wrote on Fri, Mar 3, 2023 at 22:30:

> The PR for HBASE-27632 is ready for review now.
>
> In this PR we just removed 'hbase.regionserver.hlog.reader.impl'.
>
> Will file other issues to remove 'hbase.regionserver.hlog.writer.impl' and also update the ref guide.
>
> The most important changes in the PR are:
>
> 1. Introduced two types of WAL readers, WALStreamReader and WALTailingReader. The implementation classes are ProtobufWALStreamReader and ProtobufWALTailingReader, and they share the same base class AbstractProtobufWALReader.
> 2. You no longer need to specify SecureProtobufLogReader for reading an encrypted WAL file; ProtobufWALStreamReader and ProtobufWALTailingReader will automatically choose SecureWALCellCodec if the file was written by SecureProtobufLogWriter.
> 3. For WALStreamReader, we still need the ability to read from a specific position (in WALInputFormat, for recovering from an error). Instead of introducing a seek method, we add a startPosition parameter when creating a WALStreamReader.
> 4. The log splitting related logic is almost the same, as the API does not change much.
> 5. WALTailingReader does not throw any exceptions now; we introduce a State to indicate the reader's state and what the caller needs to do after issuing a next call. So WALEntryStream has also changed a lot, as has ReplicationSourceWALReader.
> 6. To verify that we correctly deal with partial WAL files, we introduced TestParsePartialWALFile; please see the comments in that file for more details.
> 7. hbase.regionserver.hlog.reader.impl is removed, as we now have two types of reader implementations. In general, after removing the need to specify SecureProtobufLogReader when reading an encrypted WAL file, users do not need to specify a WAL reader any more.
>
> For writing UTs, we introduced a hbase.regionserver.wal.stream.reader.impl config to specify the implementation class for WALStreamReader in tests.
>
> PTAL if you have interest.
>
> Thanks.
>
> 唐天航 wrote on Wed, Feb 15, 2023 at 18:53:
>
>> Makes sense. Thank you.
>>
>> 张铎(Duo Zhang) wrote on Wed, Feb 15, 2023 at 18:24:
>>
>>> We could discuss how to implement a BookKeeper based WAL in the future.
>>>
>>> For me, I would like to abstract at a higher level, as we do not need to have a 'file' when using BookKeeper. But this requires changing the replication abstraction too.
>>>
>>> Thanks.
>>>
>>> 唐天航 wrote on Wed, Feb 15, 2023 at 11:42:
>>>
>>>> Can we keep these configs and add a new one for the replication reader?
>>>>
>>>> As @Andrew said, one of the things we are doing is using BookKeeper for WAL storage, which depends on the configuration above. We are developing based on branch-1, and it is too early to talk about contributing this to the community, but I'm not sure what the community's attitude is: whether it is willing to accept implementations based on other storage in the future.
>>>>
>>>> Thanks.
>>>>
>>>> 张铎(Duo Zhang) wrote on Tue, Feb 14, 2023 at 21:20:
>>>>
>>>>> Thanks Andrew for the feedback.
>>>>>
>>>>> If there are no other concerns, I will start the refactoring work and deprecate these two configs.
>>>>>
>>>>> Thanks.
>>>>>
>>>>> Andrew Purtell wrote on Fri, Feb 10, 2023 at 23:01:
>>>>>
>>>>>> As you point out, these configuration settings were introduced when we migrated from SequenceFile based WALs to the protobuf format. We needed to give users a way to manually migrate, although, arguably, an auto migration would have been better.
>>>>>>
>>>>>> In theory these settings allow users to implement their own WAL readers and writers. However, I do not believe users will do this. The WAL is critical for performance and correctness. If anyone is contemplating such wizard level changes they can patch the code themselves. It's fine to document these settings as deprecated for sure, and I think it is also ok to claim them unsupported and ignored.
>>>>>>
>>>>>>> On Feb 10, 2023, at 3:41 AM, 张铎 wrote:
>>>>>>>
>>>>>>> While discussing how to deal with the problem in HBASE-27621, we proposed to introduce two types of WAL readers: one for WAL splitting, and the other for WAL replication, as replication needs to tail a WAL file which is currently being written, so the logic is much more complicated. We do not want to affect WAL splitting logic and performance while tweaking the replication related things, as all HBase users need WAL splitting
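The State-based, non-throwing contract described for WALTailingReader in the thread above can be sketched in plain Java. The enum values, interface, and in-memory reader below are illustrative assumptions, not the actual HBase API: the point is only that next() reports what the caller should do instead of throwing when it hits the end of a file that may still grow.

```java
// Hypothetical sketch of a State-driven tailing reader: next() never throws;
// it returns a state telling the caller whether an entry was consumed or
// whether to wait for more data, the way a replication tailer would.
enum ReadState { NORMAL, EOF_AND_WAIT }

interface TailingReaderSketch {
  ReadState next();
}

// Toy reader over an in-memory "WAL". Reaching the end yields EOF_AND_WAIT
// rather than an exception, since the underlying file may still be written to.
final class MemoryTailingReader implements TailingReaderSketch {
  private final String[] entries;
  private int pos;

  MemoryTailingReader(String... entries) {
    this.entries = entries;
  }

  @Override
  public ReadState next() {
    if (pos < entries.length) {
      pos++; // entry consumed; a real caller would ship it to replication
      return ReadState.NORMAL;
    }
    return ReadState.EOF_AND_WAIT; // not an error: wait and retry later
  }
}
```

A stream-style reader for WAL splitting, by contrast, can treat end-of-file as final, which is why the thread separates the two reader types.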