Going to merge the PR tomorrow if there are no other objections.

Thanks.

张铎(Duo Zhang) <palomino...@gmail.com> 于2023年3月3日周五 22:30写道:

> The PR for HBASE-27632 is ready for review now.
>
> In this PR we just removed 'hbase.regionserver.hlog.reader.impl'.
>
> Will file other issues to remove 'hbase.regionserver.hlog.writer.impl' and
> also update the ref guide.
>
> The most important changes for the PR are:
>
>    1. Introduced two types of WAL readers, WALStreamReader and
>    WALTailingReader. The implementation classes are ProtobufWALStreamReader
>    and ProtobufWALTailingReader, and they share the same base class
>    AbstractProtobufWALReader.
>    2. Now, you do not need to specify SecureProtobufLogReader for reading
>    an encrypted WAL file, the ProtobufWALStreamReader and
>    ProtobufWALTailingReader will automatically choose to use
>    SecureWALCellCodec if the file is written by SecureProtobufLogWriter.
>    3. For WALStreamReader, we still need to the ability to read from a
>    specific position(in WALInputFormat, for recovering from an error), instead
>    of introducing a seek method, we add a startPosition parameter when
>    creating a WALStreamReader.
>    4. The logging splitting related logics are almost the same, as the
>    API does not change too much.
>    5. For WALTailingReader, it does not throw any exceptions now, we
>    introduce a State to indicate the reader's state and what you need to do
>    after issuing a next call. So the WALEntryStream is also changed a lot, as
>    well as ReplicationSourceWALReader.
>    6. For verifying that we could correctly deal with partial WAL file,
>    we introduced a TestParsePartialWALFile, please see the comments of this
>    file for more details.
>    7. The hbase.regionserver.hlog.reader.impl is removed, as now we have
>    two types of reader implementations. And in general, after removing the
>    necessity of specify SecureProtobufLogReader when reading encrypted WAL
>    file, users do not need to specify WAL reader any more. For writing UTs, we
>    introduced a hbase.regionserver.wal.stream.reader.impl config, to specify
>    the implementation class for WALStreamReader in tests.
>
> PTAL if you have interest.
>
> Thanks.
>
> 唐天航 <tangtianhang...@gmail.com> 于2023年2月15日周三 18:53写道:
>
>> Make sense. Thank you.
>>
>> 张铎(Duo Zhang) <palomino...@gmail.com> 于2023年2月15日周三 18:24写道:
>>
>> > We could discuss how to implement BookKeeper based WAL in the future.
>> >
>> > For me, I would like to abstract at a higher level, as we do not need to
>> > have a 'file' when using BookKeeper. But this requires changing the
>> > replication abstraction too.
>> >
>> > Thanks.
>> >
>> > 唐天航 <tangtianhang...@gmail.com> 于2023年2月15日周三 11:42写道:
>> >
>> > > Can we keep these configs and add a new one for the replication
>> reader?
>> > >
>> > > As @Andrew said, One of the things we are doing is using BookKeeper
>> for
>> > WAL
>> > > storage. This depends on the configuration above. Although we are
>> > > developing based on branch-1, and it is too early to talk about
>> joining
>> > the
>> > > community, I'm not sure what the attitude of the community is,
>> whether it
>> > > is willing to accept implementations based on other storage in the
>> > future.
>> > >
>> > > Thanks.
>> > >
>> > >
>> > > 张铎(Duo Zhang) <palomino...@gmail.com> 于2023年2月14日周二 21:20写道:
>> > >
>> > > > Thanks Andrew for the feedback.
>> > > >
>> > > > If no other concerns, I will start the refactoring work and
>> deprecated
>> > > > these two configs.
>> > > >
>> > > > Thanks.
>> > > >
>> > > > Andrew Purtell <andrew.purt...@gmail.com> 于2023年2月10日周五 23:01写道:
>> > > >
>> > > > > As you point out these configuration settings were introduced
>> when we
>> > > > > migrated from SequenceFile based WALs to the protobuf format. We
>> > needed
>> > > > to
>> > > > > give users a way to manually migrate, although, arguably, an auto
>> > > > migration
>> > > > > would have been better.
>> > > > >
>> > > > > In theory these settings allow users to implement their own WAL
>> > readers
>> > > > > and writers. However I do not believe users will do this. The WAL
>> is
>> > > > > critical for performance and correctness. If anyone is
>> contemplating
>> > > such
>> > > > > wizard level changes they can patch the code themselves. It’s
>> fine to
>> > > > > document these settings as deprecated for sure, and I think ok
>> also
>> > to
>> > > > > claim them unsupported and ignored.
>> > > > >
>> > > > > >
>> > > > > > On Feb 10, 2023, at 3:41 AM, 张铎 <palomino...@gmail.com> wrote:
>> > > > > >
>> > > > > > While discussing how to deal with the problem in HBASE-27621,
>> we
>> > > > > proposed
>> > > > > > to introduce two types of WAL readers, one for WAL splitting,
>> and
>> > the
>> > > > > other
>> > > > > > for WAL replication, as replication needs to tail the WAL file
>> > which
>> > > is
>> > > > > > currently being written, so the logic is much more complicated.
>> We
>> > do
>> > > > not
>> > > > > > want to affect WAL splitting logic and performance while
>> tweaking
>> > the
>> > > > > > replication related things, as all HBase users need WAL
>> splitting
>> > but
>> > > > not
>> > > > > > everyone needs replication.
>> > > > > >
>> > > > > > But when reviewing the related code, I found that we have two
>> > > > > > configurations for specifying the WAL reader class and WAL write
>> > > class,
>> > > > > > which indicates that we could only have one implementation for
>> the
>> > > WAL
>> > > > > > reader. They are 'hbase.regionserver.hlog.reader.impl' and
>> > > > > > 'hbase.regionserver.hlog.writer.impl'.
>> > > > > >
>> > > > > > We mentioned these two configurations several times in our ref
>> > guide.
>> > > > > >
>> > > > > > HBase 2.0+ can no longer read Sequence File based WAL file.
>> > > > > >
>> > > > > > HBase can no longer read the deprecated WAL files written in the
>> > > Apache
>> > > > > >> Hadoop Sequence File format. The
>> > hbase.regionserver.hlog.reader.impl
>> > > > and
>> > > > > >> hbase.regionserver.hlog.writer.impl configuration entries
>> should
>> > be
>> > > > set
>> > > > > to
>> > > > > >> use the Protobuf based WAL reader / writer classes. This
>> > > > implementation
>> > > > > has
>> > > > > >> been the default since HBase 0.96, so legacy WAL files should
>> not
>> > > be a
>> > > > > >> concern for most downstream users.
>> > > > > >
>> > > > > >
>> > > > > > Configure WAL encryption.
>> > > > > >
>> > > > > > Configure WAL encryption in every RegionServer’s
>> hbase-site.xml, by
>> > > > > setting
>> > > > > >> the following properties. You can include these in the
>> HMaster’s
>> > > > > >> hbase-site.xml as well, but the HMaster does not have a WAL and
>> > will
>> > > > not
>> > > > > >> use them.
>> > > > > >> <property>
>> > > > > >>  <name>hbase.regionserver.hlog.reader.impl</name>
>> > > > > >>
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogReader</value>
>> > > > > >> </property>
>> > > > > >> <property>
>> > > > > >>  <name>hbase.regionserver.hlog.writer.impl</name>
>> > > > > >>
>> > > > > >>
>> > > > >
>> > > >
>> > >
>> >
>> <value>org.apache.hadoop.hbase.regionserver.wal.SecureProtobufLogWriter</value>
>> > > > > >> </property>
>> > > > > >> <property>
>> > > > > >>  <name>hbase.regionserver.wal.encryption</name>
>> > > > > >>  <value>true</value>
>> > > > > >> </property>
>> > > > > >
>> > > > > >
>> > > > > > So in fact, do not consider encryption, the configurations are
>> > > useless
>> > > > as
>> > > > > > we do not support reading sequence file format WAL any more, the
>> > only
>> > > > > valid
>> > > > > > options are protobuf based reader and write. And for security, I
>> > > think
>> > > > > the
>> > > > > > configuration is redundant as if encryption is enabled, we
>> should
>> > use
>> > > > > > SecureProtobufLogWriter for writing, no matter what the
>> > configuration
>> > > > > value
>> > > > > > is. And for readers, I do not think we should use a
>> configuration
>> > to
>> > > > > > specify the implementation, we should detect whether the file is
>> > > > > encrypted
>> > > > > > and choose a secure or normal reader to read the file.
>> > > > > >
>> > > > > > So here, I propose we just deprecated these two configurations
>> > > because
>> > > > > they
>> > > > > > are useless now.
>> > > > > >
>> > > > > > Thanks.
>> > > > >
>> > > >
>> > >
>> >
>>
>

Reply via email to