Thanks for the response, Duo. Unfortunately, we can't check the master page or use HBCK2 to fix issues: we have hundreds of clusters that we don't have access to, and the requirement is that they upgrade and come up automatically without human intervention. We also can't roll back, because I'd like to avoid shipping the older version: we are also upgrading Hadoop to 3.1.3 and would need to recompile the older version to work on top of it.
So, I'm looking for a foolproof method of doing this upgrade, which I realize might be a bit tricky in this case. Maybe it would be possible to write a script that "prunes" the MasterProcWALs of any procedure related to region movement? Also, if we stop the regionservers gracefully most of the time, SCP shouldn't be an issue (there's no actual guarantee that they stop gracefully rather than get killed, but for those rare cases we could just get on a call with customers).

On Tue, Mar 3, 2020 at 4:09 PM 张铎(Duo Zhang) <[email protected]> wrote:

> The key point is to make sure that you do not have running procedures when
> stopping the 2.1.4 cluster. This can be done by checking the master page.
> And if there are unsupported procedures when starting 2.2.3, the master
> will just quit without doing any damage, so it is fine to just roll back to
> 2.1.4.
>
> Removing the master proc WAL directory is a possible way to make sure the
> master for 2.2.3 can be up normally, but maybe you need to make use of
> HBCK2 to fix the cluster if you removed some important procedures such as
> SCP.
>
> Andrey Elenskiy <[email protected]> wrote on Wed, Mar 4, 2020 at 07:37:
>
> > Hello,
> >
> > We'd like to upgrade from 2.1.4 to hbase 2.2.3; however, we are shipping
> > hbase along with our software on a regular schedule. Each upgrade of our
> > software requires starting and stopping the entire stack (hbase included),
> > so we are fine with downtime. The following guide
> > (https://hbase.apache.org/book.html#upgrade2.2) seems to be tailored
> > towards zero-time upgrades and requires the 2.1 version of hbasemaster to
> > drain region procedures while regionservers are online before starting
> > the 2.2 hbasemaster.
> >
> > That process seems to be non-trivial to automate in our declarative init
> > system and requires shipping an entire old hbase version just for an
> > upgrade. I was wondering if there are any alternatives if we are ok with
> > downtime.
> >
> > Couple ideas:
> >
> > 1. Apply the patch
> > (https://issues.apache.org/jira/secure/attachment/12944775/0001-HBASE-21075-Confirm-that-we-can-rolling-upgrade-from.patch)
> > to hbase 2.2. I'd prefer not to do it as we have to compile hbase
> > ourselves (which we already do, but it seems like throwaway work).
> > 2. Delete the MasterProcWALs directory after stopping the entire hbase
> > and before starting the new hbase 2.2 master.
> >
> > Any other suggestions?
> >
> > Andrey
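
For concreteness, option 2 above could be sketched roughly like this in Python. It is only a sketch under assumptions: the HBase root directory is assumed to be `/hbase` on HDFS, the function name `set_aside_proc_wals` is made up for illustration, and the directory is renamed rather than deleted (a deliberate deviation from "delete", so the old procedure WALs remain available if someone later has to investigate). It does nothing about the caveat Duo raised: if an important procedure such as SCP was pending, HBCK2 may still be needed.

```python
import subprocess
import time


def set_aside_proc_wals(root_dir="/hbase", run=subprocess.call, now=time.time):
    """Rename <root_dir>/MasterProcWALs out of the way.

    Returns the backup path, or None if the directory does not exist.
    root_dir="/hbase" is an assumption; use the cluster's hbase.rootdir.
    """
    src = f"{root_dir.rstrip('/')}/MasterProcWALs"

    # `hdfs dfs -test -d` exits with 0 only if the path is an existing directory.
    if run(["hdfs", "dfs", "-test", "-d", src]) != 0:
        return None

    # Rename instead of delete, with a timestamp suffix to avoid collisions.
    dst = f"{src}.bak-{int(now())}"
    if run(["hdfs", "dfs", "-mv", src, dst]) != 0:
        raise RuntimeError(f"could not move {src} to {dst}")
    return dst
```

This would have to run only after every 2.1.4 master and regionserver process has stopped and before the 2.2.3 master starts; the `run` and `now` parameters exist only so the logic can be exercised without a real cluster.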
