We are not paying for CDH -- our older version of CDH (5.16.2) was
pre-licensing. We've never used CM. We are planning to migrate off of CDH
onto Apache, and have 10+ years of experience working with HBase internals
and operating HBase at scale. I'm curious if anyone has knowledge of any
incompatibilities in the replication layer between these 2 versions, as
that is not very well covered in the public docs afaict. I'm aware this
will likely be a multi-month or year+ long project for us, and am just
starting the investigation phase :) It honestly looks like it might be an
easier project than the pre-0.96 to 1.x upgrade we undertook years ago,
though we're at a different scale today.

On Wed, May 19, 2021 at 9:17 AM Marc Hoppins <marc.hopp...@eset.sk> wrote:

> If you are paying for CDH then just upgrade via Cloudera Manager. If you
> are not paying for it then I think you will find it a huge problem.
>
> The upgrade may have to go through a version 6 release and then a newer
> version to get to a suitable HBase/Hadoop version.
>
> We are currently on CDH 6.3.2 but the HBase is an extremely useless version
> (2.1.0) and we are not in the business of generating income from the data
> so cannot justify the exorbitant cost per node that Cloudera is asking for
> later versions.
>
> -----Original Message-----
> From: Bryan Beaudreault <bbeaudrea...@hubspot.com.INVALID>
> Sent: Wednesday, May 19, 2021 2:49 PM
> To: user@hbase.apache.org
> Subject: Upgrading cdh5.16.2 to apache hbase 2.4 using replication
>
> We are running about 40 HBase clusters, with over 5000 regionservers total.
> These are all running cdh5.16.2. We also have thousands of clients (from
> APIs to kafka workers to hadoop jobs, etc) hitting these various clusters,
> also running cdh5.16.2.
>
> We are starting to plan an upgrade to hbase 2.x and hadoop 3.x. I've read
> through the docs on https://hbase.apache.org/book.html#_upgrade_paths
> and am starting to plan our approach. More than a few seconds of downtime
> is not an option, but rolling upgrade also seems risky (if not impossible
> for our version).
>
> One thought I had is whether replication is compatible between these two
> versions. If so, we probably would consider swapping onto upgraded clusters
> using backup/restore + replication. If we were to go this route we'd
> probably want to consider bi-directional replication so that we can roll
> back to the old cluster if there's a regression.
>
> Does anyone have any experience with this approach? Is the replication
> protocol compatible across these versions? Any concerns, tips or other
> considerations to keep in mind? We do the backup/restore + replication
> approach pretty regularly to move tables between clusters.
>
> Thanks!
>
