Re: [DISCUSS] Move replication queue storage from zookeeper to a separated HBase table

2022-09-01 Thread Liangjun He
Meeting notes:

Attendees: Duo Zhang, Yu Li, Xin Sun, Tianhang Tang, Liangjun He

First, Liangjun introduced the old implementation of
ReplicationSyncUp/DumpReplicationQueues, as well as the existing problems and
preliminary solutions under the new replication implementation. Then we
discussed the proposed solutions; the discussion is summarized below:

ReplicationSyncUp tool

1. The ReplicationSyncUp tool replicates the remaining data to the backup
cluster when the master cluster crashes. Under the new implementation, if the
master cluster crashes, the tool cannot access the hbase:replication table.
Is it possible to also copy the replication queue info to ZK whenever it is
written to the hbase table, and then implement the ReplicationSyncUp tool on
top of ZK again?

Since our goal is to reduce the reliance on ZK for storing replication queue
info, this would defeat that goal. Alternatively, we could use HMaster
maintenance mode to bring up hbase:meta and then perform additional repair
operations, but maintenance mode only allows the HMaster itself to access
hbase:meta; external clients cannot, so this approach cannot be used.

After discussion, we all agreed that without relying on external storage, the
problem is difficult to solve under the new replication implementation. If ZK
(or a third-party storage system) is used, we have a data sync problem: how to
sync-write the replication queue info to ZK (with real-time sync-writing, how
do we ensure consistency between the hbase table and ZK; with timed
sync-writing, some redundant data will be replicated when ReplicationSyncUp
runs, though partially redundant data may be acceptable), and, once the master
cluster recovers, how to sync the replication queue info from ZK back to the
hbase:replication table.
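
To make the timed sync-writing variant concrete, below is a minimal Java
sketch of a background task that periodically mirrors a serialized copy of the
replication queue info into a single znode. The znode path and
serializeQueueInfo() are hypothetical placeholders, not HBase APIs; a failed
run simply widens the window of redundant data that ReplicationSyncUp may
re-replicate.

{noformat}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class QueueInfoMirror {
  // Hypothetical znode for the mirrored queue info; not an HBase convention.
  private static final String MIRROR_ZNODE = "/hbase/replication-mirror";

  private final ZooKeeper zk;
  private final ScheduledExecutorService pool =
    Executors.newSingleThreadScheduledExecutor();

  QueueInfoMirror(ZooKeeper zk) {
    this.zk = zk;
  }

  void start(long periodSeconds) {
    pool.scheduleAtFixedRate(this::mirrorOnce, periodSeconds, periodSeconds,
      TimeUnit.SECONDS);
  }

  private void mirrorOnce() {
    try {
      byte[] data = serializeQueueInfo();
      if (zk.exists(MIRROR_ZNODE, false) == null) {
        zk.create(MIRROR_ZNODE, data, ZooDefs.Ids.OPEN_ACL_UNSAFE,
          CreateMode.PERSISTENT);
      } else {
        zk.setData(MIRROR_ZNODE, data, -1); // -1 accepts any znode version
      }
    } catch (Exception e) {
      // A missed mirror only means more redundant data on the next SyncUp run.
    }
  }

  private byte[] serializeQueueInfo() {
    // Placeholder: scan hbase:replication and serialize the queue offsets.
    return new byte[0];
  }
}
{noformat}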

Further, we can also solve the problem of ReplicationSyncUp accessing
replication queue info based on snapshots of the hbase:replication table: a
snapshot of the hbase:replication table is generated periodically, and when
the ReplicationSyncUp tool is executed, the snapshot is loaded into memory.
After the tool finishes and the data is fully replicated, we regenerate a new
snapshot from the in-memory info and write it to the file system. When the
master cluster recovers, the HMaster restores the hbase:replication table from
the new snapshot.
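
Below is a minimal Java sketch of that snapshot-based flow, only to make the
three steps concrete. Every type and method body here (the offset map layout,
loadSnapshot, replicateRemainingWals, writeSnapshot) is a hypothetical
placeholder, not the actual HBase snapshot machinery.

{noformat}
import java.util.HashMap;
import java.util.Map;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class SnapshotBasedSyncUp {
  public void run(Configuration conf, Path snapshotDir) throws Exception {
    FileSystem fs = snapshotDir.getFileSystem(conf);
    // 1. Load the latest periodic snapshot of hbase:replication into memory:
    //    here modeled as a map from WAL file to the offset already replicated.
    Map<String, Long> walOffsets = loadSnapshot(fs, snapshotDir);
    // 2. Replicate the remaining WAL data to the backup cluster, advancing
    //    the in-memory offsets as each WAL drains.
    replicateRemainingWals(conf, walOffsets);
    // 3. Write a new snapshot reflecting the fully replicated state; on
    //    recovery the HMaster restores hbase:replication from it.
    writeSnapshot(fs, snapshotDir, walOffsets);
  }

  private Map<String, Long> loadSnapshot(FileSystem fs, Path dir) {
    return new HashMap<>(); // placeholder: deserialize snapshot files under dir
  }

  private void replicateRemainingWals(Configuration conf,
    Map<String, Long> offsets) {
    // placeholder: ship edits past each recorded offset to the backup cluster
  }

  private void writeSnapshot(FileSystem fs, Path dir,
    Map<String, Long> offsets) {
    // placeholder: serialize the updated offsets back to the file system
  }
}
{noformat}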

2. If the ReplicationSyncUp tool is implemented based on the hbase:replication
snapshot, then after ReplicationSyncUp has finished and the data is fully
replicated, we must ensure that when the master cluster recovers the HMaster
starts first and restores the snapshot to hbase:replication; otherwise a
RegionServer could start first and replicate redundant data to the backup
cluster. But we cannot guarantee that the HMaster starts before the
RegionServers when the cluster recovers, so how do we ensure that the HMaster
restores the snapshot to the hbase:replication table first?

Option 1: If a RegionServer starts first and finds a corresponding snapshot of
the hbase:replication table, its replication-related operations wait until the
HMaster starts and restores the snapshot to the hbase:replication table; only
then does the RegionServer continue replicating data. The advantage of this
approach is that it is transparent to the user, but the implementation is
complicated.

Option 2: After ReplicationSyncUp finishes, we disable the peer. Even if a
RegionServer starts first, replication will not run until the HMaster starts
and restores the snapshot to the hbase:replication table. The disadvantage of
this approach is that it can confuse the user, because the peer is disabled
without the user knowing; the advantage is that the implementation is simple.
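
For reference, disabling a peer in normal operation is a one-line Admin call,
as in the sketch below; under ReplicationSyncUp the master cluster is down, so
the tool would instead have to flip the peer state in the snapshot data it
writes back (that part is the open question, not shown).

{noformat}
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class DisablePeerExample {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
        ConnectionFactory.createConnection(HBaseConfiguration.create());
        Admin admin = conn.getAdmin()) {
      admin.disableReplicationPeer("1"); // "1" is an example peer id
    }
  }
}
{noformat}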

At present, it seems that no solution solves the problem perfectly.

DumpReplicationQueues tool

1. Under the new replication implementation, most of the info output by the
DumpReplicationQueues tool can be obtained through the new interface,
consistent with the old implementation. The difference is that each queue in
the new implementation only stores one WAL and its corresponding offset info,
while the old implementation stored all WAL files and their offset info for
the queue, so the old DumpReplicationQueues tool included all WAL files and
offsets when outputting queue info. In the new implementation, we can also
access the file system directly to get all the WAL files corresponding to a
queue, which makes the output fully consistent with that of the old
DumpReplicationQueues tool.
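
As a sketch of that file system access, the WAL files for a region server live
under <hbase.rootdir>/WALs/<server name> by default (a deployment may relocate
them with hbase.wal.dir, which this sketch ignores, and it assumes a client
configuration with hbase.rootdir set), so the tool could list them directly:

{noformat}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class ListWalsExample {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    // args[0] is the region server name, e.g. host,16020,1660000000000
    Path walDir = new Path(conf.get("hbase.rootdir"), "WALs/" + args[0]);
    FileSystem fs = walDir.getFileSystem(conf);
    for (FileStatus f : fs.listStatus(walDir)) {
      System.out.println(f.getPath().getName() + " len=" + f.getLen());
    }
  }
}
{noformat}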

[jira] [Reopened] (HBASE-25563) Add "2.4 Documentation" to the website

2022-09-01 Thread Nick Dimiduk (Jira)


 [ https://issues.apache.org/jira/browse/HBASE-25563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk reopened HBASE-25563:
--

Reopening to add the 2.4 menu entry to the main site.

> Add "2.4 Documentation" to the website
> --
>
> Key: HBASE-25563
> URL: https://issues.apache.org/jira/browse/HBASE-25563
> Project: HBase
>  Issue Type: Task
>  Components: community, documentation
>Affects Versions: 3.0.0-alpha-1
>Reporter: Nick Dimiduk
>Assignee: Andrew Kyle Purtell
>Priority: Major
>
> Docs from the 2.4.0 build should be slotted into the website.





[jira] [Resolved] (HBASE-27351) Add 2.5 Documentation to the website

2022-09-01 Thread Nick Dimiduk (Jira)


 [ https://issues.apache.org/jira/browse/HBASE-27351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nick Dimiduk resolved HBASE-27351.
--
Resolution: Fixed

> Add 2.5 Documentation to the website
> 
>
> Key: HBASE-27351
> URL: https://issues.apache.org/jira/browse/HBASE-27351
> Project: HBase
>  Issue Type: Task
>  Components: community
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
>Priority: Major
> Fix For: 3.0.0-alpha-4
>
>
> Following the example from HBASE-24487, I will create PRs for both the main
> repo and the hbase-site repo.
> Generate the site documentation from the main repo and copy it to the site
> repo, as follows:
> {noformat}
> hbase $ git checkout rel/2.5.0
> hbase $ mvn clean site site:stage -DskipTests
> hbase-site $ git checkout asf-site
> hbase-site $ cp -r ../hbase/target/staging ./2.5
> hbase-site $ git add 2.5
> {noformat}
> Then, update the site structure to include a reference to the new content, 
> following https://github.com/apache/hbase/pull/2060





[jira] [Created] (HBASE-27354) EOF thrown by WALEntryStream causes replication blocking

2022-09-01 Thread Sun Xin (Jira)
Sun Xin created HBASE-27354:
---

 Summary: EOF thrown by WALEntryStream causes replication blocking
 Key: HBASE-27354
 URL: https://issues.apache.org/jira/browse/HBASE-27354
 Project: HBase
  Issue Type: Bug
  Components: Replication
Affects Versions: 2.4.14, 3.0.0-alpha-3, 2.5.0, 2.6.0
Reporter: Sun Xin
Assignee: Sun Xin


In
[WALEntryStream#readNextEntryAndRecordReaderPosition|https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/WALEntryStream.java#L257],
it is possible that we read uncommitted data. If we read beyond the committed
file length, we reopen the inputStream and seek back.

In our usage, we found that the seek-back position may be exactly the length
of the file being written, which may cause an EOF.

The thrown EOF is finally caught in
[ReplicationSourceWALReader.run|https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java#L158],
but
[totalBufferUsed|https://github.com/apache/hbase/blob/308cd729d23329e6d8d4b9c17a645180374b5962/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceWALReader.java#L78]
is not cleaned up.

After a long run, all peers will slow down and eventually block completely.
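
A reduced, self-contained Java sketch of the accounting leak (not the actual
ReplicationSourceWALReader code): a shared quota counter charged by an
in-flight batch must be released when the batch is abandoned on EOF, otherwise
the charged bytes leak and the quota eventually blocks every peer.

{noformat}
import java.io.EOFException;
import java.util.concurrent.atomic.AtomicLong;

public class BufferQuotaSketch {
  // Stand-in for the quota shared across all replication sources.
  private final AtomicLong totalBufferUsed = new AtomicLong();

  void readBatch() {
    long charged = 0;
    try {
      charged += 1024;                 // stand-in for per-entry heap size
      totalBufferUsed.addAndGet(1024); // charge the shared quota
      throw new EOFException("read past the committed file length");
    } catch (EOFException e) {
      // The fix direction: release what the abandoned batch charged, so the
      // quota does not leak and slowly starve all peers.
      totalBufferUsed.addAndGet(-charged);
    }
  }
}
{noformat}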





[jira] [Created] (HBASE-27353) opentelemetry-context jar missing at runtime causes MR jobs to fail

2022-09-01 Thread Ujjawal Kumar (Jira)
Ujjawal Kumar created HBASE-27353:
-

 Summary: opentelemetry-context jar missing at runtime causes MR 
jobs to fail
 Key: HBASE-27353
 URL: https://issues.apache.org/jira/browse/HBASE-27353
 Project: HBase
  Issue Type: Bug
Affects Versions: 2.5.0, 3.0.0-alpha-1
Reporter: Ujjawal Kumar


MR jobs like RowCounter and Import have a dependency on opentelemetry-context
via TableInputFormat -> ConnectionFactory ->
io.opentelemetry.context.ImplicitContextKeyed, which causes these jobs to fail
with a missing class. This would be an addendum on top of
[HBASE-25811|https://issues.apache.org/jira/browse/HBASE-25811] (which added
opentelemetry-api and opentelemetry-semconv).


Error: java.lang.NoClassDefFoundError: io/opentelemetry/context/ImplicitContextKeyed
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:757)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
    at java.net.URLClassLoader.defineClass(URLClassLoader.java:473)
    at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:365)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:218)
    at org.apache.hadoop.hbase.client.ConnectionFactory.createConnection(ConnectionFactory.java:131)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormat.initialize(TableInputFormat.java:193)
    at org.apache.hadoop.hbase.mapreduce.TableInputFormatBase.createRecordReader(TableInputFormatBase.java:162)
    at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.<init>(MapTask.java:528)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:771)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
Caused by: java.lang.ClassNotFoundException: io.opentelemetry.context.ImplicitContextKeyed
    at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:365)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
    ... 24 more
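
A possible job-side workaround sketch (distinct from whatever addendum lands
in HBase itself): ship the opentelemetry-context jar with the MR job by naming
a class from that artifact via the existing TableMapReduceUtil helper.

{noformat}
import java.io.IOException;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.mapreduce.Job;

public class ShipOtelContextJar {
  static void configure(Job job) throws IOException {
    // Adds the jar containing io.opentelemetry.context.Context (the
    // opentelemetry-context artifact) to the job's distributed classpath.
    TableMapReduceUtil.addDependencyJarsForClasses(job.getConfiguration(),
      io.opentelemetry.context.Context.class);
  }
}
{noformat}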


