Re: Testing rollback after HDFS upgrade

Evans Ye Sat, 29 Aug 2020 12:19:39 -0700

Hi Luca,

I'm on vacation, so I don't have time to review it right now. I'll get back
to you next week.


The doc is definitely valuable. Once you have successfully migrated
production, we can show other users that this is a battle-tested
solution. Beyond that, we could give a talk at ApacheCon or elsewhere to
further amplify the impact of the work. This is definitely an open source
success story, so I think it deserves a talk.

Evans


Luca Toscano <toscano.l...@gmail.com> wrote on Thu, Aug 27, 2020 at 9:11 PM:

> Hi Evans,
>
> it took a while I know but I have the first version of the gdoc for the
> upgrade:
>
>
> https://docs.google.com/document/d/1fI1mvbR1mFLV6ohU5cIEnU5hFvEE7EWnKYWOkF55jtE/edit?usp=sharing
>
> I tried to list all the steps involved in migrating from CDH 5 to
> Bigtop 1.4; anybody interested should be able to comment. My idea
> is to discuss this for a few days and then possibly make
> it permanent somewhere in the Bigtop wiki (if the document
> is considered useful for others, of course).
>
> Over the past few days I tested the procedure multiple times, including
> the HDFS finalize step, and everything works as expected. I
> hope to be able to move to Bigtop during the next couple of months.
>
> Luca
>
> On Tue, Jul 21, 2020 at 4:04 PM Evans Ye <evan...@apache.org> wrote:
> >
> > Yes. I think a shared gdoc is preferred, and you can open a JIRA
> > ticket to track it.
> >
> > Luca Toscano <toscano.l...@gmail.com> wrote on Mon, Jul 20, 2020 at 21:10:
> >>
> >> Hi Evans!
> >>
> >> What is the best medium to use for the documentation/comments ? A
> >> shared gdoc or something similar?
> >>
> >> Luca
> >>
> >> On Thu, Jul 16, 2020 at 5:11 PM Evans Ye <evan...@apache.org> wrote:
> >> >
> >> > One thing I think would be great to have is a written version of the
> >> > steps for upgrade and rollback. The benefits:
> >> > 1. If anything unexpected happens during automation, people can
> >> > quickly understand what's going on and start investigating.
> >> > 2. Sharing the doc with us helps other OSS users doing the same
> >> > migration. Environment-specific details are fine; we can leave
> >> > comments on them. At least all the other users get a high-level view
> >> > of a proven solution, and they can then work out the remaining pieces
> >> > by themselves.
> >> >
> >> > For automation, I suggest splitting it into several stages and
> >> > applying some validation steps (manual is OK) before kicking off the
> >> > next stage.
> >> >
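The staged approach suggested above could be sketched roughly as follows.
This is only a minimal illustration, not the code from the cookbooks linked
later in the thread: the names Stage, run_stages and ask_operator are
hypothetical, and a real stage would call the actual cluster tooling.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Stage:
    """One step of the migration plus the check that gates the next step."""
    name: str
    run: Callable[[], None]        # performs the stage's work
    validate: Callable[[], bool]   # True if it is safe to continue


def ask_operator(question: str, reader: Callable[[str], str] = input) -> bool:
    """Manual validation: only an explicit 'y' lets the automation continue."""
    return reader(f"{question} [y/N] ").strip().lower() == "y"


def run_stages(stages: List[Stage]) -> List[str]:
    """Run stages in order and stop at the first failed validation.

    Returns the names of the stages that ran and validated successfully,
    so the operator knows exactly where the procedure stopped.
    """
    completed = []
    for stage in stages:
        stage.run()
        if not stage.validate():
            # Stop here so that someone can investigate or roll back.
            break
        completed.append(stage.name)
    return completed
```

A validation callback can be fully automated (for example a health check) or
simply wrap ask_operator so that a human confirms before the next stage runs.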
> >> > Best,
> >> > Evans
> >> >
> >> >
> >> >
> >> >
> >> > Luca Toscano <toscano.l...@gmail.com> wrote on Wed, Jul 15, 2020 at 9:07 PM:
> >> >>
> >> >> Hi everybody,
> >> >>
> >> >> I didn't get the time to work on this until recently, but I finally
> >> >> managed to have a reliable procedure to upgrade from CDH to Bigtop
> >> >> 1.4 and roll back if needed. The assumptions are:
> >> >>
> >> >> 1) It is ok to have (limited) cluster downtime.
> >> >> 2) Rolling upgrade is not needed.
> >> >> 3) QJM is used.
> >> >>
> >> >> The procedure is listed in these two scripts:
> >> >>
> >> >>
> >> >> https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/hadoop/stop-cluster.py
> >> >>
> >> >> https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/hadoop/change-distro-from-cdh.py
> >> >>
> >> >> The code is highly dependent on my working environment, but it should
> >> >> be clear to follow when writing a tutorial about how to migrate from
> >> >> CDH to Bigtop. All the suggestions given by this mailing list were
> >> >> really useful to reach a solution!
> >> >>
> >> >> My next steps will be:
> >> >>
> >> >> 1) Keep testing Bigtop 1.4 (finalize HDFS upgrade, run more hadoop
> >> >> jobs, test Hive 2, etc..).
> >> >> 2) Upgrade the production Hadoop cluster to Bigtop 1.4 on Debian 9
> >> >> (HDFS 2.6.0-cdh -> 2.8.5).
> >> >> 3) Upgrade to Bigtop 1.5 on Debian 9 (HDFS 2.8.5 -> 2.10).
> >> >> 4) Upgrade to Debian 10.
> >> >>
> >> >> With automation it shouldn't be very difficult; I'll report progress
> >> >> once made.
> >> >>
> >> >> Thanks a lot!
> >> >>
> >> >> Luca
> >> >>
> >> >> On Mon, Apr 13, 2020 at 9:25 AM Luca Toscano <toscano.l...@gmail.com>
> >> >> wrote:
> >> >> >
> >> >> > Hi Evans,
> >> >> >
> >> >> > thanks a lot for the feedback, it was exactly what I needed. "The
> >> >> > simpler the better" is definitely good advice in this use case.
> >> >> > This week I'll try another rollout/rollback and report back :)
> >> >> >
> >> >> > Luca
> >> >> >
> >> >> > On Thu, Apr 9, 2020 at 8:09 PM Evans Ye <evan...@apache.org>
> >> >> > wrote:
> >> >> > >
> >> >> > > Hi Luca,
> >> >> > >
> >> >> > > Thanks for reporting back and letting us know how it goes.
> >> >> > > I don't have exactly this experience of upgrading HDFS with QJM
> >> >> > > HA. What I did, back in 2014, was upgrade 0.20 non-HA to 2.0
> >> >> > > non-HA and then enable QJM HA.
> >> >> > >
> >> >> > > Regarding rollback, I think you're right:
> >> >> > >
> >> >> > > it is possible to rollback to HDFS’ state before the upgrade in
> >> >> > > case of unexpected problems.
> >> >> > >
> >> >> > > My previous experience is the same: the rollback is merely a
> >> >> > > snapshot taken before the upgrade. The further you've gone, the
> >> >> > > more data a rollback loses... Our runbook is that if our sanity
> >> >> > > check fails during the upgrade downtime, we perform the rollback
> >> >> > > immediately.
> >> >> > >
> >> >> > > Regarding the FSImage hole issue, I've experienced it as well.
> >> >> > > I managed to fix it by manually editing the FSImage with the
> >> >> > > offline image viewer [1] and deleting that missing editLog from
> >> >> > > the FSImage. That actually brought my cluster back with a small
> >> >> > > number of missing blocks.
> >> >> > >
> >> >> > > Our experience says that the more steps there are, the greater
> >> >> > > the chance that the upgrade fails. We did well across a dozen
> >> >> > > rounds of testing on the DEV and STAGING clusters, but still got
> >> >> > > missing blocks when upgrading Production...
> >> >> > >
> >> >> > > The suggestion is to get your production cluster in good shape
> >> >> > > first (the fewer decommissioned or offline DNs and disk failures,
> >> >> > > the better).
> >> >> > > Also, maybe you could switch to non-HA mode for the upgrade to
> >> >> > > simplify things?
> >> >> > >
> >> >> > > It's not much help, but please let us know of any progress.
> >> >> > > One last thing: have you reached out to the Hadoop community? The
> >> >> > > authors should know the most :)
> >> >> > >
> >> >> > > - Evans
> >> >> > >
> >> >> > > [1] https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html
> >> >> > >
> >> >> > > Luca Toscano <toscano.l...@gmail.com> wrote on Wed, Apr 8, 2020 at 21:03:
> >> >> > >>
> >> >> > >> Hi everybody,
> >> >> > >>
> >> >> > >> most of the bugs/issues/etc. that I found while upgrading from
> >> >> > >> CDH 5 to Bigtop 1.4 are fixed. I am now testing (as also
> >> >> > >> suggested here) upgrade/rollback procedures for HDFS (all
> >> >> > >> written in https://phabricator.wikimedia.org/T244499; I will add
> >> >> > >> documentation about this at the end, I promise).
> >> >> > >>
> >> >> > >> I initially followed [1][2] in my Test cluster, choosing the
> >> >> > >> rolling upgrade, but when I tried to roll back (days after the
> >> >> > >> initial upgrade) I ended up in an inconsistent state and wasn't
> >> >> > >> able to recover the previous HDFS state. I didn't save the exact
> >> >> > >> error messages, but the situation was more or less the following:
> >> >> > >>
> >> >> > >> FS-Image-rollback (created at the time of the upgrade) - up to
> >> >> > >> transaction X
> >> >> > >> FS-Image-current - up to transaction Y, with Y = X + 10000
> >> >> > >> (number totally made up for the example)
> >> >> > >> QJM cluster: first available transaction Z = X + 10000 + 1
> >> >> > >>
> >> >> > >> When I tried a rolling rollback, the Namenode complained about a
> >> >> > >> hole in the transaction log, namely at X + 1, so it refused to
> >> >> > >> start. I tried to force a regular rollback, but the Namenode
> >> >> > >> refused again, saying that there was no FS image available to
> >> >> > >> roll back to. I checked the Hadoop code, and indeed the Namenode
> >> >> > >> saves the FS image with a different name/path depending on
> >> >> > >> whether the upgrade is rolling or regular. Both failures make
> >> >> > >> sense, especially the first one, since there was indeed a hole
> >> >> > >> between the last transaction of the FS-Image-rollback and the
> >> >> > >> first transaction available to replay on the QJM cluster. I
> >> >> > >> chose the rolling upgrade initially since it was appealing: it
> >> >> > >> promises to bring the Namenodes back to their previous versions
> >> >> > >> while keeping the data modified between upgrade and rollback.
> >> >> > >>
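The hole described above can be reproduced with a small Python sketch. It is
a simplified model of the reasoning only, not how the Namenode implements
the check: the function name first_missing_txid and the representation of
edit log segments as (first_txid, last_txid) pairs are assumptions made for
the illustration, and the numbers are made up as in the example above.

```python
from typing import List, Optional, Tuple


def first_missing_txid(image_last_txid: int,
                       segments: List[Tuple[int, int]]) -> Optional[int]:
    """Return the first transaction id that cannot be replayed, or None.

    image_last_txid: last transaction contained in the FS image.
    segments: (first_txid, last_txid) ranges of the available edit logs.
    """
    next_needed = image_last_txid + 1
    for first, last in sorted(segments):
        if first > next_needed:
            # Nothing covers next_needed: there is a hole in the log.
            return next_needed
        next_needed = max(next_needed, last + 1)
    return None


X = 5000                              # made-up last txid of the rollback image
available = [(X + 10001, X + 10100)]  # QJM only has transactions from Z onwards

print(first_missing_txid(X, available))          # rollback image: hole at X + 1
print(first_missing_txid(X + 10000, available))  # current image: no hole (None)
```

Starting from the rollback image, replay needs transaction X + 1, which is no
longer available; starting from the current image, the first available
transaction Z is exactly the next one needed.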
> >> >> > >> I then found [3], which says that with QJM everything is more
> >> >> > >> complicated and a regular rollback is the only option available.
> >> >> > >> What I think this means is that, because the edit log is spread
> >> >> > >> among multiple nodes, a rollback that keeps the data written
> >> >> > >> between upgrade and rollback is not available, so in the worst
> >> >> > >> case the data modified during that timeframe is lost. Not a big
> >> >> > >> deal in my case, but I want to triple check with you whether
> >> >> > >> this is the correct interpretation, or whether there is another
> >> >> > >> tutorial/guide/etc. that I haven't read with a different
> >> >> > >> procedure :)
> >> >> > >>
> >> >> > >> Is my interpretation correct? If not, is there anybody with
> >> >> > >> experience in HDFS upgrades who could shed some light on the
> >> >> > >> subject?
> >> >> > >>
> >> >> > >> Thanks in advance!
> >> >> > >>
> >> >> > >> Luca
> >> >> > >>
> >> >> > >>
> >> >> > >>
> >> >> > >> [1] https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Upgrade_and_Rollback
> >> >> > >> [2] https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
> >> >> > >> [3] https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#HDFS_UpgradeFinalizationRollback_with_HA_Enabled
>
