My understanding of the motivation for moving hbck2 (and potentially other
operations-related tools) to a separate project was to allow it to evolve
on its own cadence. As new interventions to fix previously unforeseen
problems were identified, we would be able to deliver updated tool versions
without requiring a new release of the main project, thereby avoiding the
need for a cluster upgrade.

At the same time, the current pattern, for hbck2 at least, is to avoid
handing too much control over region assignment inconsistencies to the
client, favouring an approach where the hbck2 client mainly acts as a proxy
for fix methods exposed on the master RPC interface. That way, the master
is able to keep track of all procedures manipulating regions, whereas an
independent process manipulating regions at the same time could cause more
problems than it fixes (we saw this happen many times with misuse of hbck
v1 methods).

However, with such a restriction we are again limiting the ability to
evolve independently of main project releases and cluster upgrades. With
growing adoption of branch-2 based releases by our customer base, there's
been a surge in assignment issues that can't be fixed solely with the
existing hbck2 commands. One example I had is what motivated HBASE-22143.
I would favour the approach of allowing more complex client methods, mainly
because it allows for adding new fixes on demand, but there's also value in
keeping things owned by the master.

I am interested to hear the general opinion on the way forward with hbck2.
Should we completely discourage client-bound logic patches such as the one
from HBASE-22143?
