Thanks for your input, Andrew and Nick!

A big thank you as well to Duo for your hands-on-keyboard commitment to this whole feature.

I am also happy to target 2.x (and not 2.5.x) for the backport.

In the interest of getting rid of this feature branch (and the inevitable rebase pains the longer it runs parallel to master), I'd like to move ahead with a concrete plan to merge.

1. Given there was no objection, do folks feel the need for a VOTE? If even one person would like a VOTE, I'm happy to start one. Please just say so.

2. We have three outstanding PRs for SFT which are all (IMO) very close to merging (#3851, #3861, and #3942). I think #3851 and #3942 are easy to include and just need one more review cycle. If we feel like we are still far away on #3861, I think we should set that aside and revisit it after the feature merge is done.

If there are any other concerns, please shout!

- Josh

On 12/8/21 9:07 PM, Andrew Purtell wrote:
+1 for merging to branch-2 (2.6)

On Dec 8, 2021, at 6:04 PM, 张铎 <palomino...@gmail.com> wrote:

I think here we just want this to be backported to 2.x, not 2.5.x.

So thanks Andrew for the quick action.

+1 on merging HBASE-26067 to master and backporting to branch-2 (2.6.0).

Thanks.

On Thu, Dec 9, 2021 at 8:45 AM Andrew Purtell <apurt...@apache.org> wrote:

I concur with Nick, but let me help here by branching 2.5 today. It was
always going to be a somewhat arbitrary point.

On Wed, Dec 8, 2021 at 3:09 PM Nick Dimiduk <ndimi...@apache.org> wrote:

Based solely on the comments made to this thread, I would recommend
against a merge to branch-2, given that we are very close to 2.5. The
points about existing gaps seem like things we're not ready to publish
in the impending minor release. Once we have a branch-2.5, this
particular concern of mine will be alleviated.

Thanks,
Nick

On Wed, Dec 8, 2021 at 1:37 PM Josh Elser <els...@apache.org> wrote:

I was going to wait for some other folks to chime in, but I guess I can
be the next one :)

Duo, Wellington, and Szabolcs have been doing some excellent work on
the storefile tracking (SFT) to a degree that I never expected to see.
I remember some of the original "Filesystem re-do" issues on Jira. The
idea was exceptional, but the result seemed unreachable.

These devs, building on the success of what Zach/Stephen first talked
about in HBASE-24749, came up with what I think is an excellent step
forward. I've yet to break it via my own testing, but do acknowledge
that there's always more work to be done.

I think this is at a reasonable place to merge this back into the
"mainline" branches from the feature branch (HBASE-26067). I believe
this is ready because:

1. The feature is completely opt-in (HBase works the same way by
   default)
2. There is an API to migrate tables to the new SFT implementation
3. There is also an API to migrate tables back to the default
   implementation

Some gaps still exist around bulk loading, documentation, snapshots,
and recovery tooling, but these are being worked on. In the context of
S3, this makes for a significantly more compelling offering of HBase by
removing the complexity of HBOSS. For HBase in all installations, I
think SFT makes for a significantly more "deterministic" way of
managing regions/files.

+1 from me to merge HBASE-26067 into master and branch-2

- Josh

On 12/7/21 10:31 AM, Wellington Chevreuil wrote:
Hello everyone,

We have been making progress on the alternative way of tracking store
files originally proposed by Duo in HBASE-26067.

To briefly summarize it for those not following it, this feature
introduces an abstraction layer to track the store files still
used/needed by store engines, allowing different approaches to
identifying the store files required by a given store to be plugged in.
The design doc describing it in more detail is available here:
<https://docs.google.com/document/d/16Nr1Fn3VaXuz1g1FTiME-bnGR3qVK5B-raXshOkDLcY/edit#heading=h.calrs3kn4d8s>

Our main goal with this feature is to avoid the need for temp files and
renames when creating new hfiles (whenever flushing, compacting,
splitting/merging, or snapshotting). This is made possible by the
pluggable tracker implementation labeled "FILE". The current behavior
using temp dirs and renames would still be the default approach
(labeled "DEFAULT").

This "renameless" approach is appealing for deployments using the
Amazon S3 object store file system, where the lack of atomic rename
operations imposes the need for an additional layer of locking (HBOSS),
which, combined with the s3a rename operation, can incur a performance
overhead.

Some test runs on my employer's infrastructure have shown promising
results. A pure-insertion YCSB run has shown a ~6% performance gain on
client writes. A snapshot clone of a table with hundreds of regions
completes in half the time. There are also improvements in compaction,
split, and merge times.

Talking with Duo Zhang and Josh Elser in the HBASE-26067 jira, we feel
optimistic that the current implementation is in a good state to get
merged into the master branch, but it would be nice to hear other
opinions about it before we actually commit it. Looking forward to
hearing any thoughts/concerns you might have.

Kind regards,
Wellington.





--
Best regards,
Andrew

Words like orphans lost among the crosstalk, meaning torn from truth's
decrepit hands
   - A23, Crosstalk
