[jira] [Created] (HBASE-27700) rolling-restart.sh stops all masters at the same time

2023-03-09 Thread Jack Yang (Jira)
Jack Yang created HBASE-27700:
-

 Summary: rolling-restart.sh stops all masters at the same time
 Key: HBASE-27700
 URL: https://issues.apache.org/jira/browse/HBASE-27700
 Project: HBase
  Issue Type: Improvement
Reporter: Jack Yang


The rolling-restart.sh script in $HBASE_HOME/bin stops all master services 
(including the backup ones) at the same time, and then restarts them at the 
same time:
{code:java}
# The content of rolling-restart.sh
...
# stop all masters before re-start to avoid races for master znode
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" stop master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" stop master-backup

# make sure the master znode has been deleted before continuing
zmaster=`$bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool zookeeper.znode.master`
...

# all masters are down, now restart
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}"
${START_CMD_DIST_MODE} master
"$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
--hosts "${HBASE_BACKUP_MASTERS}" ${START_CMD_DIST_MODE} master-backup {code}
This makes the HMaster service unavailable for the whole window. We can 
restart the masters in a more graceful way, for example (sketched below):
 * Stop the backup masters, and then restart them one by one
 * Stop the active master; one of the backup masters then becomes active
 * Start the original active master, which now becomes a backup
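
A minimal sketch of the proposed sequence, reusing the daemon commands the 
script already calls (the actual patch may differ; the temp-file host handling 
here is illustrative only):
{code:java}
# Sketch only (not the actual patch). Restart backup masters one at a time:
for host in $(cat "${HBASE_BACKUP_MASTERS}"); do
  echo "$host" > /tmp/one-backup-master
  "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
    --hosts /tmp/one-backup-master stop master-backup
  "$bin"/hbase-daemons.sh --config "${HBASE_CONF_DIR}" \
    --hosts /tmp/one-backup-master ${START_CMD_DIST_MODE} master-backup
done

# Fail over: stopping the active master lets a backup take the master znode.
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" stop master

# Bring the old active master back; it now registers as a backup.
"$bin"/hbase-daemon.sh --config "${HBASE_CONF_DIR}" ${START_CMD_DIST_MODE} master {code}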

Will upload a patch soon.





[jira] [Created] (HBASE-27699) User metrics for filtered and read rows are too expensive

2023-03-09 Thread Bryan Beaudreault (Jira)
Bryan Beaudreault created HBASE-27699:
-

 Summary: User metrics for filtered and read rows are too expensive
 Key: HBASE-27699
 URL: https://issues.apache.org/jira/browse/HBASE-27699
 Project: HBase
  Issue Type: Improvement
Reporter: Bryan Beaudreault


MetricsUserAggregateImpl has a pattern like this:
{code:java}
String user = getActiveUser();
if (user != null) {
  MetricsUserSource userSource = getOrCreateMetricsUser(user);
  incrementFilteredReadRequests(userSource);
} {code}
So every update involves a getOrCreate call, which does a ConcurrentHashMap 
lookup. This overhead is not too bad for most requests, because it's executed 
just once per request (e.g. updatePut gets called once at the end, though for 
multis it happens once per action).

For updateFilteredReadRequests and updateReadRequestCount, these are currently 
called in RegionScannerImpl for every row scanned or filtered. Doing the map 
lookup over and over adds up: profiling the regionserver under load, I see over 
5% of the time spent updating these metrics.

We should try to collect these metrics, maybe in the RpcCallContext, and then 
translate them into user metrics once at the end of the request. Or otherwise 
find a way to avoid querying the ConcurrentHashMap multiple times in the 
context of a request. Maybe we should actually stash the MetricsUserSource in 
the RpcCallContext so that all user metrics only need to do the lookup once, 
even for multis.
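
As a rough illustration of the stashing idea (the rpcCall variable and its 
getter/setter below are hypothetical, not existing API):
{code:java}
// Hypothetical sketch: cache the MetricsUserSource on the call context so the
// ConcurrentHashMap lookup happens at most once per request, not once per row.
MetricsUserSource userSource = rpcCall.getMetricsUserSource(); // assumed accessor
if (userSource == null) {
  String user = getActiveUser();
  if (user != null) {
    userSource = getOrCreateMetricsUser(user); // single map lookup per request
    rpcCall.setMetricsUserSource(userSource);  // assumed setter
  }
}
if (userSource != null) {
  incrementFilteredReadRequests(userSource);   // no map lookup on the hot path
} {code}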





[jira] [Created] (HBASE-27698) Migrating meta locations from zookeeper may not always be possible when migrating from 1.x HBase

2023-03-09 Thread Rajeshbabu Chintaguntla (Jira)
Rajeshbabu Chintaguntla created HBASE-27698:
---

 Summary: Migrating meta locations from zookeeper may not always be 
possible when migrating from 1.x HBase
 Key: HBASE-27698
 URL: https://issues.apache.org/jira/browse/HBASE-27698
 Project: HBase
  Issue Type: Bug
  Components: migration
Reporter: Rajeshbabu Chintaguntla
Assignee: Rajeshbabu Chintaguntla


In HBase 1.x versions, the meta server location is removed from zookeeper when 
the server stops. In such cases, migrating to the 2.5.x branches may not create 
any meta entries in the master data. So if we cannot find the meta location in 
zookeeper, we can recover it from the WAL directories with the .meta extension 
and add it to the master data.
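
For illustration, a sketch of the fallback scan (assuming the standard "WALs" 
layout; the actual patch may look quite different):
{code:java}
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Sketch only: find server WAL directories containing a ".meta" WAL,
// i.e. servers that hosted the meta region, to seed the master data.
final class MetaWalLocator {
  static List<String> findMetaWalServers(Configuration conf, Path rootDir)
    throws Exception {
    Path walRoot = new Path(rootDir, "WALs"); // assumed layout
    FileSystem fs = walRoot.getFileSystem(conf);
    List<String> servers = new ArrayList<>();
    for (FileStatus serverDir : fs.listStatus(walRoot)) {
      for (FileStatus wal : fs.listStatus(serverDir.getPath())) {
        if (wal.getPath().getName().endsWith(".meta")) {
          servers.add(serverDir.getPath().getName()); // host,port,startcode
          break;
        }
      }
    }
    return servers;
  }
} {code}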





[DISCUSS] Remove checkRefCnt for on-heap ByteBuffs?

2023-03-09 Thread Bryan Beaudreault
Hey all,

We have a use-case for HFile.Reader at my company: scanning many hfiles for a
small subset of cells, so it is mostly just iterating a large number of HFiles
and discarding most of the cells. We recently upgraded that deployable from
super old cdh 5.16 (hbase 1.2-ish) to hbase 2.5.3. In doing so, we noticed a
performance regression of around 4x. I imagine this regression would also
affect ClientSideRegionScanner.

We did some profiling and noticed that a large amount of time is spent in
SingleByteBuff.checkRefCnt. It seems like every SingleByteBuff method calls
checkRefCnt, and this check compares a volatile in AtomicIntegerFieldUpdater
in the netty code.

I believe ref counting is mostly necessary for off-heap buffers, but
on-heap buffers are also wrapped in SingleByteBuff and so also go through
checkRefCnt. We removed the checkRefCnt call, and the regression
disappeared.

We created a simple test method which just does HFile.createReader,
reader.getScanner(), and then iterates the scanner counting the total cells.
The test reads an hfile with 100M cells and takes over 1 minute with
checkRefCnt. Removing checkRefCnt brings the runtime down to 20s.

It's worth noting that the regression is most prominent on java17. It's
slightly less obvious on java11, with runtime being 40s vs 28s.

Thoughts on updating SingleByteBuff to skip the checkRefCnt call for
on-heap buffers? We can handle this in the constructor, when wrapping an
on-heap buffer here [1].

[1]
https://github.com/apache/hbase/blob/master/hbase-common/src/main/java/org/apache/hadoop/hbase/io/ByteBuffAllocator.java#L300
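
For concreteness, here is a rough sketch of the idea (hypothetical shape,
not the actual SingleByteBuff internals):

import java.nio.ByteBuffer;
import java.util.concurrent.atomic.AtomicInteger;

// Rough sketch: decide once, at wrap time, whether refcount checking is
// needed at all, so on-heap wraps never touch the volatile.
final class SingleByteBuffSketch {
  private final ByteBuffer buf;
  private final AtomicInteger refCnt = new AtomicInteger(1);
  private final boolean checkRefCnt;

  SingleByteBuffSketch(ByteBuffer buf) {
    this.buf = buf;
    this.checkRefCnt = buf.isDirect(); // only direct buffers need the check
  }

  private void ensureAccessible() {
    if (checkRefCnt && refCnt.get() == 0) {
      throw new IllegalStateException("buffer already released");
    }
  }

  byte get(int index) {
    ensureAccessible(); // cheap, well-predicted branch for on-heap wraps
    return buf.get(index);
  }
}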


[jira] [Created] (HBASE-27697) Create a dummy metric implementation in ConnectionImplementation.

2023-03-09 Thread Rushabh Shah (Jira)
Rushabh Shah created HBASE-27697:


 Summary: Create a dummy metric implementation in 
ConnectionImplementation.
 Key: HBASE-27697
 URL: https://issues.apache.org/jira/browse/HBASE-27697
 Project: HBase
  Issue Type: Bug
  Components: metrics
Affects Versions: 2.5.0
Reporter: Rushabh Shah


This Jira is for branch-2 only.

If the CLIENT_SIDE_METRICS_ENABLED_KEY conf is set to true, then we initialize 
metrics to a MetricsConnection instance; otherwise it is set to null.
{code:java}
if (conf.getBoolean(CLIENT_SIDE_METRICS_ENABLED_KEY, false)) {
  this.metricsScope = MetricsConnection.getScope(conf, clusterId, this);
  this.metrics = MetricsConnection.getMetricsConnection(this.metricsScope,
    this::getBatchPool, this::getMetaLookupPool);
} else {
  this.metrics = null;
}
{code}

Whenever we want to update metrics, we always do a null check first. We can 
improve this by creating a dummy metrics object with an empty implementation. 
When we want to populate the metrics, we can check whether metrics is an 
instance of the dummy implementation.
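
As a rough sketch of the idea (names hypothetical; the real MetricsConnection 
has many more update methods):
{code:java}
// Hypothetical null-object: a shared no-op instance replaces the null checks
// at every update site.
interface ClientMetrics {
  void incrMetaCacheHit();
  ClientMetrics NOOP = () -> { /* intentionally empty */ };
}

// At construction time:
//   this.metrics = clientMetricsEnabled ? realMetrics : ClientMetrics.NOOP;
// At every update site, no null check is needed:
//   metrics.incrMetaCacheHit();
// Where the distinction still matters, compare identity:
//   if (metrics != ClientMetrics.NOOP) { ... } {code}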





Re: [DISCUSS] Deprecate the 'hbase.regionserver.hlog.reader.impl' and 'hbase.regionserver.hlog.writer.impl' configurations

2023-03-09 Thread Duo Zhang
Going to merge the PR tomorrow if there are no other objections.

Thanks.

张铎 (Duo Zhang) wrote on Friday, March 3, 2023 at 22:30:

> The PR for HBASE-27632 is ready for review now.
>
> In this PR we just removed 'hbase.regionserver.hlog.reader.impl'.
>
> Will file other issues to remove 'hbase.regionserver.hlog.writer.impl' and
> also update the ref guide.
>
> The most important changes for the PR are:
>
>    1. Introduced two types of WAL readers, WALStreamReader and
>    WALTailingReader. The implementation classes are ProtobufWALStreamReader
>    and ProtobufWALTailingReader, and they share the same base class
>    AbstractProtobufWALReader.
>    2. You no longer need to specify SecureProtobufLogReader for reading an
>    encrypted WAL file; ProtobufWALStreamReader and ProtobufWALTailingReader
>    will automatically choose SecureWALCellCodec if the file was written by
>    SecureProtobufLogWriter.
>    3. For WALStreamReader, we still need the ability to read from a
>    specific position (in WALInputFormat, for recovering from an error).
>    Instead of introducing a seek method, we add a startPosition parameter
>    when creating a WALStreamReader.
>    4. The log splitting related logic is almost unchanged, as the API does
>    not change much.
>    5. WALTailingReader does not throw any exceptions now; we introduce a
>    State to indicate the reader's state and what you need to do after
>    issuing a next call. So WALEntryStream has also changed a lot, as has
>    ReplicationSourceWALReader.
>    6. To verify that we correctly deal with partial WAL files, we
>    introduced TestParsePartialWALFile; please see the comments in that file
>    for more details.
>    7. The hbase.regionserver.hlog.reader.impl config is removed, as we now
>    have two types of reader implementations. And in general, after removing
>    the need to specify SecureProtobufLogReader when reading encrypted WAL
>    files, users do not need to specify a WAL reader any more. For writing
>    UTs, we introduced a hbase.regionserver.wal.stream.reader.impl config to
>    specify the implementation class for WALStreamReader in tests.
>
> PTAL if you have interest.
>
> Thanks.
>
> 唐天航 wrote on Wednesday, February 15, 2023 at 18:53:
>
>> Makes sense. Thank you.
>>
>> 张铎 (Duo Zhang) wrote on Wednesday, February 15, 2023 at 18:24:
>>
>> > We could discuss how to implement BookKeeper based WAL in the future.
>> >
>> > For me, I would like to abstract at a higher level, as we do not need to
>> > have a 'file' when using BookKeeper. But this requires changing the
>> > replication abstraction too.
>> >
>> > Thanks.
>> >
>> > 唐天航 wrote on Wednesday, February 15, 2023 at 11:42:
>> >
>> > > Can we keep these configs and add a new one for the replication
>> > > reader?
>> > >
>> > > As @Andrew said, one of the things we are doing is using BookKeeper
>> > > for WAL storage, which depends on the configurations above. We are
>> > > developing based on branch-1 and it is too early to talk about joining
>> > > the community, but I'm not sure what the community's attitude is,
>> > > whether it would be willing to accept implementations based on other
>> > > storage in the future.
>> > >
>> > > Thanks.
>> > >
>> > >
>> > > 张铎 (Duo Zhang) wrote on Tuesday, February 14, 2023 at 21:20:
>> > >
>> > > > Thanks Andrew for the feedback.
>> > > >
>> > > > If there are no other concerns, I will start the refactoring work
>> > > > and deprecate these two configs.
>> > > >
>> > > > Thanks.
>> > > >
>> > > > Andrew Purtell wrote on Friday, February 10, 2023 at 23:01:
>> > > >
>> > > > > As you point out, these configuration settings were introduced
>> > > > > when we migrated from SequenceFile based WALs to the protobuf
>> > > > > format. We needed to give users a way to manually migrate,
>> > > > > although, arguably, an auto migration would have been better.
>> > > > >
>> > > > > In theory these settings allow users to implement their own WAL
>> > > > > readers and writers. However I do not believe users will do this.
>> > > > > The WAL is critical for performance and correctness. If anyone is
>> > > > > contemplating such wizard level changes they can patch the code
>> > > > > themselves. It's fine to document these settings as deprecated for
>> > > > > sure, and I think it is also ok to claim them unsupported and
>> > > > > ignored.
>> > > > >
>> > > > > >
>> > > > > > On Feb 10, 2023, at 3:41 AM, 张铎  wrote:
>> > > > > >
>> > > > > > While discussing how to deal with the problem in HBASE-27621,
>> > > > > > we proposed to introduce two types of WAL readers, one for WAL
>> > > > > > splitting, and the other for WAL replication, as replication
>> > > > > > needs to tail the WAL file which is currently being written, so
>> > > > > > the logic is much more complicated. We do not want to affect WAL
>> > > > > > splitting logic and performance while tweaking the replication
>> > > > > > related things, as all HBase users need WAL splitting