Re: Testing rollback after HDFS upgrade

Evans Ye Sun, 27 Sep 2020 23:49:00 -0700

Oh ok. That sounds great!

On Mon, Sep 28, 2020 at 14:31 Luca Toscano <toscano.l...@gmail.com> wrote:


> Hi Evans,
>
> what I meant by a shared blog post is something that goes on
> http://techblog.wikimedia.org/ and on
> https://blogs.apache.org/bigtop/, stating that we collaborated and how
> :)
>
> Luca
>
> On Mon, Sep 21, 2020 at 5:44 PM Evans Ye <evan...@apache.org> wrote:
> >
> > Yes. Overall it sounds great to me!
> >
> > I think the "summary of known pitfalls/bugs/etc.." section is worth adding and might be a super valuable part of the whole thing.
> >
> > | "The Blog post would be a good idea, maybe something that we can share between Wikimedia and Apache"
> > What do you mean by this one, specifically? Currently there are 3 things we can do, listed below. Do they match what you have in mind, or is it something else?
> >
> > 1. Bigtop wiki/blogs:
> > https://cwiki.apache.org/confluence/display/BIGTOP/Index
> > https://blogs.apache.org/bigtop/
> >
> > 2. Success At Apache:
> > https://blogs.apache.org/foundation/category/SuccessAtApache
> >
> > 3. ApacheCon Talk (this year's CFP is over, we can do it next year as a post-production experience sharing)
> > https://apachecon.com/index.html
> >
> > - Evans
> >
> >
> > On Sun, Sep 20, 2020 at 4:55 PM Luca Toscano <toscano.l...@gmail.com> wrote:
> >>
> >> Hi Evans,
> >>
> >> I am late in answering as well :)
> >>
> >> I thought about it and I think that with the right premises (example:
> >> this is tailored for Wikimedia's environment, it assumes that a
> >> cluster downtime is acceptable, etc..) the storytelling style might be
> >> easier to digest than a list of steps to follow. I think that in
> >> all use cases different from Wikimedia there will be adjustments to
> >> make, and things that work/don't-work/etc.. One thing that it might be
> >> good to add at the end is a "summary of known pitfalls/bugs/etc.."
> >> found during the procedure, that in my case were the most
> >> time-consuming ones. I'll add it during the next few days and people
> >> can comment :)
> >>
> >> The Blog post would be a good idea, maybe something that we can share
> >> between Wikimedia and Apache? I am planning to move to BigTop during
> >> the upcoming quarter (October -> December), that will also show if my
> >> procedure works on a cluster of 60+ nodes (rather than on a small one
> >> of 8 nodes) :D. As soon as I have done it I'll follow up with this
> >> list to organize a blog post, does it sound ok?
> >>
> >> Thanks a lot for all the support!
> >>
> >> Luca
> >>
> >> On Tue, Sep 15, 2020 at 6:06 PM Evans Ye <evan...@apache.org> wrote:
> >> >
> >> > Hey Luca,
> >> >
> >> > Sorry for the late reply. I was busy with a conference, which is just over now.
> >> > Anyway, I think the writing is pretty informative. But it's more of a storytelling style, and several parts are Wikimedia-specific. That's why I think it's more suitable for a blog post.
> >> >
> >> > Anyhow, I think either way it's great content. If we keep it as is, I think we can make it available on Bigtop's wiki and blog, or even Success at Apache, with a title like "Wikimedia's story of migrating from CDH to Bigtop". If you want to make it more like an official guide, the title would be "CDH to Bigtop Migration Guide". We can state the limitations and the environment so that people take it with the caution that it might not suit their own environment.
> >> >
> >> > Which way to go depends on how much effort you'd like to take. Let me
> know what you think so that we can move forward.
> >> >
> >> > - Evans
> >> >
> >> > On Mon, Sep 7, 2020 at 3:39 PM Luca Toscano <toscano.l...@gmail.com> wrote:
> >> >>
> >> >> Hi Evans,
> >> >>
> >> >> thanks for the review! What are the things that you'd like to see to
> >> >> make them more consumable for users? I can re-shape the writing, I
> >> >> tried to come up with something to kick off a conversation with the
> >> >> community, it would be interesting to know if anybody else has a
> >> >> similar use case and how/if they are working on a solution.
> >> >>
> >> >> For the blogpost, maybe we can coordinate something shared between
> >> >> Apache and Wikimedia when the migration is done, I am sure it would be
> >> >> a nice example of the two Foundations collaborating :)
> >> >>
> >> >> Luca
> >> >>
> >> >> On Wed, Sep 2, 2020 at 8:21 PM Evans Ye <evan...@apache.org> wrote:
> >> >> >
> >> >> > Hi Luca,
> >> >> >
> >> >> > I read through the doc briefly. I think the doc works very well as a blog post telling the success story of Wikimedia migrating from CDH to Bigtop. However, the current writing doesn't seem easily consumable for users who are just looking for the solutions/steps for doing similar migrations. May I know what title you would prefer if we put the doc in Bigtop's wiki?
> >> >> >
> >> >> > What I was thinking is the cookbook for migration. But we can
> discuss this. IMHO a Success at Apache[1] blogpost is also possible. But I
> need to figure out who to talk to. Let me know what you think.
> >> >> >
> >> >> > [1] https://blogs.apache.org/foundation/category/SuccessAtApache
> >> >> >
> >> >> > Evans
> >> >> >
> >> >> > On Sun, Aug 30, 2020 at 3:18 AM Evans Ye <evan...@apache.org> wrote:
> >> >> >>
> >> >> >> Hi Luca,
> >> >> >>
> >> >> >> I'm on vacation, so I don't have time for a review right now. I'll get back to you next week.
> >> >> >>
> >> >> >> The doc is definitely valuable. Once you have migrated your production cluster successfully, we can show other users that this is a battle-proven solution. Even more, we can give a talk at ApacheCon or somewhere else to further amplify the impact of the work. This is definitely an open source win, so I think it deserves a talk.
> >> >> >>
> >> >> >> Evans
> >> >> >>
> >> >> >>
> >> >> >> On Thu, Aug 27, 2020 at 9:11 PM Luca Toscano <toscano.l...@gmail.com> wrote:
> >> >> >>>
> >> >> >>> Hi Evans,
> >> >> >>>
> >> >> >>> it took a while I know but I have the first version of the gdoc
> for the upgrade:
> >> >> >>>
> >> >> >>>
> https://docs.google.com/document/d/1fI1mvbR1mFLV6ohU5cIEnU5hFvEE7EWnKYWOkF55jtE/edit?usp=sharing
> >> >> >>>
> >> >> >>> I tried to list all the steps involved in migrating from CDH 5 to
> >> >> >>> Bigtop 1.4, anybody interested should be able to comment. The idea
> >> >> >>> that I have is to discuss this for a few days and then possibly make
> >> >> >>> it permanent somewhere in the Bigtop wiki? (of course if the document
> >> >> >>> will be considered useful for others etc..)
> >> >> >>>
> >> >> >>> During these days I tested the procedure multiple times, and I have
> >> >> >>> also tested the HDFS finalize step, everything works as expected. I
> >> >> >>> hope to be able to move to Bigtop during the next couple of months.
> >> >> >>>
> >> >> >>> Luca
> >> >> >>>
> >> >> >>> On Tue, Jul 21, 2020 at 4:04 PM Evans Ye <evan...@apache.org>
> wrote:
> >> >> >>> >
> >> >> >>> > Yes. I think a shared gdoc is preferred, and you can open a JIRA ticket to track it.
> >> >> >>> >
> >> >> >>> > On Mon, Jul 20, 2020 at 21:10 Luca Toscano <toscano.l...@gmail.com> wrote:
> >> >> >>> >>
> >> >> >>> >> Hi Evans!
> >> >> >>> >>
> >> >> >>> >> What is the best medium to use for the documentation/comments? A
> >> >> >>> >> shared gdoc or something similar?
> >> >> >>> >>
> >> >> >>> >> Luca
> >> >> >>> >>
> >> >> >>> >> On Thu, Jul 16, 2020 at 5:11 PM Evans Ye <evan...@apache.org>
> wrote:
> >> >> >>> >> >
> >> >> >>> >> > One thing I think would be great to have is a doc version of the steps for upgrade and rollback. The benefits:
> >> >> >>> >> > 1. If anything unexpected happens during automation, you have folks who can quickly understand what's going on and start investigating.
> >> >> >>> >> > 2. You can share the doc with us to help other OSS users do the migration. The env-specific things are fine, I think; we can leave comments on them. At least all the other users can get a high-level view of a proven solution, and then they can find the rest of the pieces by themselves.
> >> >> >>> >> >
> >> >> >>> >> > For automation, I suggest splitting it into several stages and applying some validation steps (manual is ok) before kicking off the next stage.
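A minimal sketch of the staged approach suggested above, in Python (the language of the Wikimedia cookbooks linked elsewhere in this thread). The names Stage, run_stages and confirm are hypothetical and are not taken from those cookbooks. The sketch assumes each stage is a plain callable and that a validation step may be a manual yes/no confirmation.

```python
from typing import Callable, List, NamedTuple


class Stage(NamedTuple):
    """One step of the procedure plus the check that must pass after it."""
    name: str
    action: Callable[[], None]     # performs the stage (hypothetical callable)
    validate: Callable[[], bool]   # returns True when it is safe to continue


def run_stages(stages: List[Stage]) -> List[str]:
    """Run stages in order; stop at the first stage whose validation fails."""
    completed: List[str] = []
    for stage in stages:
        stage.action()
        if not stage.validate():
            # Do not kick off the next stage: report where we stopped so that
            # a human can investigate or start the rollback.
            raise RuntimeError(
                f"validation failed after stage {stage.name!r}; "
                f"completed so far: {completed}"
            )
        completed.append(stage.name)
    return completed


def confirm(question: str) -> bool:
    """Manual validation: ask the operator before moving on."""
    return input(f"{question} [y/N] ").strip().lower() == "y"
```

A stage list could then pair, for example, a "stop the cluster" action with confirm("Are all daemons stopped?") and an "upgrade packages" action with an automated check. Because run_stages raises as soon as a validation fails, the operator always knows which stages have completed.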
> >> >> >>> >> >
> >> >> >>> >> > Best,
> >> >> >>> >> > Evans
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> >
> >> >> >>> >> > On Wed, Jul 15, 2020 at 9:07 PM Luca Toscano <toscano.l...@gmail.com> wrote:
> >> >> >>> >> >>
> >> >> >>> >> >> Hi everybody,
> >> >> >>> >> >>
> >> >> >>> >> >> I didn't get the time to work on this until recently, but
> I finally
> >> >> >>> >> >> managed to have a reliable procedure to upgrade from CDH
> to Bigtop 1.4
> >> >> >>> >> >> and rollback if needed. The assumptions are:
> >> >> >>> >> >>
> >> >> >>> >> >> 1) It is ok to have (limited) cluster downtime.
> >> >> >>> >> >> 2) Rolling upgrade is not needed.
> >> >> >>> >> >> 3) QJM is used.
> >> >> >>> >> >>
> >> >> >>> >> >> The procedure is listed in these two scripts:
> >> >> >>> >> >>
> >> >> >>> >> >>
> https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/hadoop/stop-cluster.py
> >> >> >>> >> >>
> https://github.com/wikimedia/operations-cookbooks/blob/master/cookbooks/sre/hadoop/change-distro-from-cdh.py
> >> >> >>> >> >>
> >> >> >>> >> >> The code is highly dependent on my working environment,
> but it should
> >> >> >>> >> >> be clear to follow when writing a tutorial about how to
> migrate from
> >> >> >>> >> >> CDH to Bigtop. All the suggestions given by this mailing
> list were
> >> >> >>> >> >> really useful to reach a solution!
> >> >> >>> >> >>
> >> >> >>> >> >> My next steps will be:
> >> >> >>> >> >>
> >> >> >>> >> >> 1) Keep testing Bigtop 1.4 (finalize HDFS upgrade, run
> more hadoop
> >> >> >>> >> >> jobs, test Hive 2, etc..).
> >> >> >>> >> >> 2) Upgrade the production Hadoop cluster to Bigtop 1.4 on
> Debian 9
> >> >> >>> >> >> (HDFS 2.6.0-cdh -> 2.8.5).
> >> >> >>> >> >> 3) Upgrade to Bigtop 1.5 on Debian 9 (HDFS 2.8.5 -> 2.10).
> >> >> >>> >> >> 4) Upgrade to Debian 10.
> >> >> >>> >> >>
> >> >> >>> >> >> With automation it shouldn't be very difficult, I'll
> report progress once made.
> >> >> >>> >> >>
> >> >> >>> >> >> Thanks a lot!
> >> >> >>> >> >>
> >> >> >>> >> >> Luca
> >> >> >>> >> >>
> >> >> >>> >> >> On Mon, Apr 13, 2020 at 9:25 AM Luca Toscano <
> toscano.l...@gmail.com> wrote:
> >> >> >>> >> >> >
> >> >> >>> >> >> > Hi Evans,
> >> >> >>> >> >> >
> >> >> >>> >> >> > thanks a lot for the feedback, it was exactly what I needed. The
> >> >> >>> >> >> > simpler the better is definitely good advice in this use case, I'll
> >> >> >>> >> >> > try another rollout/rollback this week and report back :)
> >> >> >>> >> >> >
> >> >> >>> >> >> > Luca
> >> >> >>> >> >> >
> >> >> >>> >> >> > On Thu, Apr 9, 2020 at 8:09 PM Evans Ye <
> evan...@apache.org> wrote:
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > Hi Luca,
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > Thanks for reporting back and letting us know how it goes.
> >> >> >>> >> >> > > I don't have experience with exactly this case, an HDFS upgrade with QJM HA. The experience I had was a 0.20 non-HA upgrade to 2.0 non-HA, followed by enabling QJM HA, which was back in 2014.
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > Regarding rollback, I think you're right:
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > it is possible to rollback to HDFS’ state before the
> upgrade in case of unexpected problems.
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > My previous experience is the same: the rollback is merely a return to a snapshot taken before the upgrade. The further you have gone, the more data a rollback loses... Our runbook says that if our sanity check fails during the upgrade downtime, we perform the rollback immediately.
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > Regarding that FSImage hole issue, I've experienced it as well.
> >> >> >>> >> >> > > I managed to fix it by manually editing the FSImage with the offline image viewer[1] and deleting that missing editLog in the FSImage. That actually brought my cluster back with a small number of missing blocks.
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > Our experience says that the more steps there are, the higher the chance that the upgrade fails. We did well across a dozen test runs (DEV cluster, STAGING cluster), but still got missing blocks when upgrading Production...
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > The suggestion is to get your production cluster in good shape first (the fewer decommissioned or offline DNs and disk failures, the better).
> >> >> >>> >> >> > > Also, maybe you can switch to non-HA mode and do the upgrade to simplify things?
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > Not much help, I know, but please let us know of any progress.
> >> >> >>> >> >> > > One last thing: have you reached out to the Hadoop community? The authors should know the most :)
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > - Evans
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > [1]
> https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HdfsImageViewer.html
> >> >> >>> >> >> > >
> >> >> >>> >> >> > > On Wed, Apr 8, 2020 at 21:03 Luca Toscano <toscano.l...@gmail.com> wrote:
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> Hi everybody,
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> most of the bugs/issues/etc.. that I found while
> upgrading from CDH 5
> >> >> >>> >> >> > >> to BigTop 1.4 are fixed, I am now testing (as
> suggested also in here)
> >> >> >>> >> >> > >> upgrade/rollback procedures for HDFS (all written in
> >> >> >>> >> >> > >> https://phabricator.wikimedia.org/T244499, will add
> documentation
> >> >> >>> >> >> > >> about this at the end I promise).
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> I initially followed [1][2] in my Test cluster,
> choosing the Rolling
> >> >> >>> >> >> > >> upgrade, but when I tried to rollback (after days
> since the initial
> >> >> >>> >> >> > >> upgrade) I ended up in an inconsistent state and I
> wasn't able to
> >> >> >>> >> >> > >> recover the previous HDFS state. I didn't save the
> exact error
> >> >> >>> >> >> > >> messages but the situation was more or less the
> following:
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> FS-Image-rollback (created at the time of the
> upgrade) - up to transaction X
> >> >> >>> >> >> > >> FS-Image-current - up to transaction Y, with Y = X +
> 10000 (number
> >> >> >>> >> >> > >> totally made up for the example)
> >> >> >>> >> >> > >> QJM cluster: first available transaction Z = X +
> 10000 + 1
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> When I tried a rolling rollback, the Namenode complained about a hole
> >> >> >>> >> >> > >> in the transaction log, namely at X + 1, so it
> refused to start. I
> >> >> >>> >> >> > >> tried to force a regular rollback, but the Namenode
> refused again
> >> >> >>> >> >> > >> saying that there was no available FS Image to roll
> back to. I checked
> >> >> >>> >> >> > >> in the Hadoop code and indeed the Namenode saves the
> fs image with
> >> >> >>> >> >> > >> different naming/path in case of a rolling upgrade or
> a regular
> >> >> >>> >> >> > >> upgrade. Both cases make sense, especially the first
> one since there
> >> >> >>> >> >> > >> was indeed a hole between the last transaction of the
> >> >> >>> >> >> > >> FS-Image-rollback and the first available transaction
> to replay on the
> >> >> >>> >> >> > >> QJM cluster. I chose the rolling upgrade initially
> since it was
> >> >> >>> >> >> > >> appealing: it promises to bring back the Namenodes to
> their previous
> >> >> >>> >> >> > >> versions, but keeping the data modified between
> upgrade and rollback.
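The hole described above can be illustrated with a small Python sketch. It does not reproduce how the Namenode performs its check; first_missing_txid and the (start, end) segment tuples are hypothetical names, and the transaction numbers are made up, as in the example above. The function walks the edit-log segments available for replay and reports the first transaction ID that is missing on top of a given FS image.

```python
from typing import List, Optional, Tuple


def first_missing_txid(image_last_txid: int,
                       segments: List[Tuple[int, int]]) -> Optional[int]:
    """Return the first txid that cannot be replayed on top of the image.

    segments holds inclusive (first_txid, last_txid) ranges of the edit-log
    segments that are still available. None means there is no hole.
    """
    expected = image_last_txid + 1
    for start, end in sorted(segments):
        if start > expected:
            # The next available segment starts too late: hole at `expected`.
            return expected
        expected = max(expected, end + 1)
    return None


X = 5000                            # made-up last txid of FS-Image-rollback
Y = X + 10000                       # last txid of FS-Image-current
qjm_segments = [(Y + 1, Y + 500)]   # QJM only has edits from Z = X + 10000 + 1

print(first_missing_txid(X, qjm_segments))  # 5001, i.e. the hole at X + 1
print(first_missing_txid(Y, qjm_segments))  # None, the current image is fine
```

Under these assumptions the rollback image cannot be brought forward, because the edits between X + 1 and Y are no longer available on the QJM cluster, while the current image has no such gap.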
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> I then found [3], in which it is said that with QJM
> everything is more
> >> >> >>> >> >> > >> complicated, and a regular rollback is the only
> option available. What
> >> >> >>> >> >> > >> I think this means is that due to the Edit log spread
> among multiple
> >> >> >>> >> >> > >> nodes, a rollback that keeps data between upgrade and
> rollback is not
> >> >> >>> >> >> > >> available, so worst case scenario the data modified
> during that
> >> >> >>> >> >> > >> timeframe is lost. Not a big deal in my case, but I
> want to triple
> >> >> >>> >> >> > >> check with you if this is the correct interpretation
> or if there is
> >> >> >>> >> >> > >> another tutorial/guide/etc.. that I haven't read with
> a different
> >> >> >>> >> >> > >> procedure :)
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> Is my interpretation correct? If not, is there
> anybody with experience
> >> >> >>> >> >> > >> in HDFS upgrades that could shed some light on the
> subject?
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> Thanks in advance!
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> Luca
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >>
> >> >> >>> >> >> > >> [1]
> https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HdfsUserGuide.html#Upgrade_and_Rollback
> >> >> >>> >> >> > >> [2]
> https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
> >> >> >>> >> >> > >> [3]
> https://hadoop.apache.org/docs/r2.8.5/hadoop-project-dist/hadoop-hdfs/HDFSHighAvailabilityWithQJM.html#HDFS_UpgradeFinalizationRollback_with_HA_Enabled
>
