[
https://issues.apache.org/jira/browse/HBASE-23326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16997737#comment-16997737
]
Duo Zhang commented on HBASE-23326:
-----------------------------------
{quote}
Thanks for putting up the doc. Can we have 'comment' access. Here are a few
notes in meantime (some carried over from github comments):
{quote}
Done.
{quote}
On layout, could have a 'master' namespace so master Region is in same place in
filesystem (Might be too awkward excluding this namespace from consideration in
general processing).
Or, here you make a MasterProcs directory. Old system had a MasterProcWALs dir.
Make instead a generic 'master' dir at top-level into which we put all stuff
master wants to persist to filesystem of which these new procedures WALs would
be first.
{quote}
The design here is to make the procedure store self-managed. IMO, most data
should be stored in hbase:meta or other system tables. The reason we need this
special store is that initializing and assigning meta depend on the procedure
framework, so we cannot use hbase:meta to store these things. This should not
be a common case, so let's isolate it from the normal region store.
{quote}
You cannot pass a RegionServerServices that has the special implementations of
flush/compaction/rolling? Just to minimize how this Region implementation
deviates from the norm.
{quote}
It is a 'local' HRegion, so in general it should not have a
RegionServerServices along with it. In fact, if we pass a
RegionServerServices in, lots of other features will be activated, such as
quota, metrics, etc. This will cause problems because, if we do not enable
tables on master, some of the components are not initialized. Of course,
metrics are useful here, but that can be a follow-on.
In general, I think we can do some refactoring on HRegion to decouple it from
RegionServerServices, and for the optional features we can add some interfaces
to make them pluggable; then the code here will be cleaner. But anyway, as said
above, this should not be a common case, so there is no need to hurry on the
refactoring.
{quote}
WAL dirs will be deleted/cleaned up after WALs are moved to recovered.edits?
There'll be no accumulation of WALs? What about archiving? Peter figured how to
get the MasterProcWALs into the general WAL archive. Maybe no archiving of
these WALs?
{quote}
I do not think we can archive the WALs to the general place, because if we
enable regions on master it will mess things up. I haven't taken care of this
part yet; the intention is to just delete the WALs in the first version. Later,
we could implement our own archiving logic, which I think is easy. Anyway, the
design here is to be self-managed. As for tracing problems, if we assume
that the HRegion and WAL framework are fine (if they are not, we should
find out on the normal read/write path), then the problem should be in our code
that reads from and writes to the HRegion. So maybe we could enable multiple
versions and keep deleted cells on this region, to make it more debug friendly.
{quote}
For recovered.edits, their content is supposed to be 'sorted'. When we move WALs
to recovered.edits, they will be 'sorted' because we write in procedure order?
Is there anything we need to do to ensure edits go into the WAL 'ordered'?
{quote}
Technically, they do not need to be 'sorted'. As there are sequence ids in the
WALEntry and we do not do compaction while replaying, order is not important.
The reason we make them sorted is performance. When splitting, we know all the
sequence ids of the WALEntries contained in a split WAL file, so we can just
name it with the sequence ids. Then when replaying, we can quickly filter out
the unnecessary WAL files. But here, since we do not need to split, it is not
necessary to read the files again and rename them...
That's why I use a different directory name for these WAL files.
You can see the modification in HRegion: I added a special config to specify
the special directory in which to place recovered.edits. If this option is set,
then the logic is a bit different: we will not filter out any WAL files, and we
do not check their name pattern.
Thanks.
> Implement a ProcedureStore which stores procedures in a HRegion
> ---------------------------------------------------------------
>
> Key: HBASE-23326
> URL: https://issues.apache.org/jira/browse/HBASE-23326
> Project: HBase
> Issue Type: Improvement
> Components: proc-v2
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
>
> So we can reuse the code in HRegion for persisting the procedures, and also
> the optimized WAL implementation for better performance.
> This requires we merge the hbase-procedure module to hbase-server, which is
> an anti-pattern as we make the hbase-server module more overloaded. But I
> think later we can first try to move the WAL stuff out.