Going to merge the PR tomorrow if there are no other objections. Thanks.
张铎(Duo Zhang) <palomino...@gmail.com> 于2023年3月3日周五 22:30写道: > The PR for HBASE-27632 is ready for review now. > > In this PR we just removed 'hbase.regionserver.hlog.reader.impl'. > > Will file other issues to remove 'hbase.regionserver.hlog.writer.impl' and > also update the ref guide. > > The most important changes for the PR are: > > 1. Introduced two types of WAL readers, WALStreamReader and > WALTailingReader. The implementation classes are ProtobufWALStreamReader > and ProtobufWALTailingReader, and they share the same base class > AbstractProtobufWALReader. > 2. Now, you do not need to specify SecureProtobufLogReader for reading > an encrypted WAL file, the ProtobufWALStreamReader and > ProtobufWALTailingReader will automatically choose to use > SecureWALCellCodec if the file is written by SecureProtobufLogWriter. > 3. For WALStreamReader, we still need to the ability to read from a > specific position(in WALInputFormat, for recovering from an error), instead > of introducing a seek method, we add a startPosition parameter when > creating a WALStreamReader. > 4. The logging splitting related logics are almost the same, as the > API does not change too much. > 5. For WALTailingReader, it does not throw any exceptions now, we > introduce a State to indicate the reader's state and what you need to do > after issuing a next call. So the WALEntryStream is also changed a lot, as > well as ReplicationSourceWALReader. > 6. For verifying that we could correctly deal with partial WAL file, > we introduced a TestParsePartialWALFile, please see the comments of this > file for more details. > 7. The hbase.regionserver.hlog.reader.impl is removed, as now we have > two types of reader implementations. And in general, after removing the > necessity of specify SecureProtobufLogReader when reading encrypted WAL > file, users do not need to specify WAL reader any more. For writing UTs, we > introduced a hbase.regionserver.wal.stream.reader.impl config, to specify > the implementation class for WALStreamReader in tests. > > PTAL if you have interest. > > Thanks. > > 唐天航 <tangtianhang...@gmail.com> 于2023年2月15日周三 18:53写道: > >> Make sense. Thank you. >> >> 张铎(Duo Zhang) <palomino...@gmail.com> 于2023年2月15日周三 18:24写道: >> >> > We could discuss how to implement BookKeeper based WAL in the future. >> > >> > For me, I would like to abstract at a higher level, as we do not need to >> > have a 'file' when using BookKeeper. But this requires changing the >> > replication abstraction too. >> > >> > Thanks. >> > >> > 唐天航 <tangtianhang...@gmail.com> 于2023年2月15日周三 11:42写道: >> > >> > > Can we keep these configs and add a new one for the replication >> reader? >> > > >> > > As @Andrew said, One of the things we are doing is using BookKeeper >> for >> > WAL >> > > storage. This depends on the configuration above. Although we are >> > > developing based on branch-1, and it is too early to talk about >> joining >> > the >> > > community, I'm not sure what the attitude of the community is, >> whether it >> > > is willing to accept implementations based on other storage in the >> > future. >> > > >> > > Thanks. >> > > >> > > >> > > 张铎(Duo Zhang) <palomino...@gmail.com> 于2023年2月14日周二 21:20写道: >> > > >> > > > Thanks Andrew for the feedback. >> > > > >> > > > If no other concerns, I will start the refactoring work and >> deprecated >> > > > these two configs. >> > > > >> > > > Thanks. >> > > > >> > > > Andrew Purtell <andrew.purt...@gmail.com> 于2023年2月10日周五 23:01写道: >> > > > >> > > > > As you point out these configuration settings were introduced >> when we >> > > > > migrated from SequenceFile based WALs to the protobuf format. We >> > needed >> > > > to >> > > > > give users a way to manually migrate, although, arguably, an auto >> > > > migration >> > > > > would have been better. >> > > > > >> > > > > In theory these settings allow users to implement their own WAL >> > readers >> > > > > and writers. However I do not believe users will do this. The WAL >> is >> > > > > critical for performance and correctness. If anyone is >> contemplating >> > > such >> > > > > wizard level changes they can patch the code themselves. It’s >> fine to >> > > > > document these settings as deprecated for sure, and I think ok >> also >> > to >> > > > > claim them unsupported and ignored. >> > > > > >> > > > > > >> > > > > > On Feb 10, 2023, at 3:41 AM, 张铎 <palomino...@gmail.com> wrote: >> > > > > > >> > > > > > While discussing how to deal with the problem in HBASE-27621, >> we >> > > > > proposed >> > > > > > to introduce two types of WAL readers, one for WAL splitting, >> and >> > the >> > > > > other >> > > > > > for WAL replication, as replication needs to tail the WAL file >> > which >> > > is >> > > > > > currently being written, so the logic is much more complicated. >> We >> > do >> > > > not >> > > > > > want to affect WAL splitting logic and performance while >> tweaking >> > the >> > > > > > replication related things, as all HBase users need WAL >> splitting >> > but >> > > > not >> > > > > > everyone needs replication. >> > > > > > >> > > > > > But when reviewing the related code, I found that we have two >> > > > > > configurations for specifying the WAL reader class and WAL write >> > > class, >> > > > > > which indicates that we could only have one implementation for >> the >> > > WAL >> > > > > > reader. They are 'hbase.regionserver.hlog.reader.impl' and >> > > > > > 'hbase.regionserver.hlog.writer.impl'. >> > > > > > >> > > > > > We mentioned these two configurations several times in our ref >> > guide. >> > > > > > >> > > > > > HBase 2.0+ can no longer read Sequence File based WAL file. >> > > > > > >> > > > > > HBase can no longer read the deprecated WAL files written in the >> > > Apache >> > > > > >> Hadoop Sequence File format. The >> > hbase.regionserver.hlog.reader.impl >> > > > and >> > > > > >> hbase.regionserver.hlog.writer.impl configuration entries >> should >> > be >> > > > set >> > > > > to >> > > > > >> use the Protobuf based WAL reader / writer classes. This >> > > > implementation >> > > > > has >> > > > > >> been the default since HBase 0.96, so legacy WAL files should >> not >> > > be a >> > > > > >> concern for most downstream users. >> > > > > > >> > > > > > >> > > > > > Configure WAL encryption. >> > > > > > >> > > > > > Configure WAL encryption in every RegionServer’s >> hbase-site.xml, by >> > > > > setting >> > > > > >> the following properties. You can include these in the >> HMaster’s >> > > > > >> hbase-site.xml as well, but the HMaster does not have a WAL and >> > will >> > > > not >> > > > > >> use them. >> > > > > >> <property> >> > > > > >> <name>hbase.regionserver.hlog.reader.impl</name> >> > > > > >> >> > > > > >> >> > > > > >> > > > >> > > >> > >> <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value> >> > > > > >> </property> >> > > > > >> <property> >> > > > > >> <name>hbase.regionserver.hlog.writer.impl</name> >> > > > > >> >> > > > > >> >> > > > > >> > > > >> > > >> > >> <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value> >> > > > > >> </property> >> > > > > >> <property> >> > > > > >> <name>hbase.regionserver.wal.encryption</name> >> > > > > >> <value>true</value> >> > > > > >> </property> >> > > > > > >> > > > > > >> > > > > > So in fact, do not consider encryption, the configurations are >> > > useless >> > > > as >> > > > > > we do not support reading sequence file format WAL any more, the >> > only >> > > > > valid >> > > > > > options are protobuf based reader and write. And for security, I >> > > think >> > > > > the >> > > > > > configuration is redundant as if encryption is enabled, we >> should >> > use >> > > > > > SecureProtobufLogWriter for writing, no matter what the >> > configuration >> > > > > value >> > > > > > is. And for readers, I do not think we should use a >> configuration >> > to >> > > > > > specify the implementation, we should detect whether the file is >> > > > > encrypted >> > > > > > and choose a secure or normal reader to read the file. >> > > > > > >> > > > > > So here, I propose we just deprecated these two configurations >> > > because >> > > > > they >> > > > > > are useless now. >> > > > > > >> > > > > > Thanks. >> > > > > >> > > > >> > > >> > >> >