On Tue, Dec 16, 2014 at 11:53 AM, Ted Yu <[email protected]> wrote:
> However, the application needs to be compiled with 0.98 jars because the
> RPC has changed.

It should be possible to build an application that's using both client
library versions using something like JarJar [0] to munge class names. I
believe Jeffrey has done exactly this in the replication bridge [1].

[0]: https://code.google.com/p/jarjar/
[1]: https://github.com/hortonworks/HBaseReplicationBridgeServer

On Tue, Dec 16, 2014 at 11:42 AM, Esteban Gutierrez <[email protected]>
wrote:

> > +1 Andrew. It's not a simple task; it is error prone and can cause data
> > loss if not performed correctly, and we don't have tooling to fix broken
> > snapshots if they are moved manually.
> >
> > BTW, 0.98 should migrate an old snapshot dir to the new post-namespaces
> > directory hierarchy after starting HBase from a 0.94 layout. If the goal
> > is to minimize downtime, this is probably a better approach: bootstrap
> > the destination cluster with 0.94 with snapshots and replication
> > enabled, then use the ExportSnapshot tool to copy the snapshots, import
> > the snapshots, and use replication on the remote cluster until the delta
> > is minimal. Then stop the destination cluster and upgrade to 0.98 (that
> > should take care of migrating everything without user intervention).
> > Once the destination cluster is migrated to 0.98, use the replication
> > bridge tool to catch up with the new edits and reduce the delta between
> > both clusters. Then just fail over the applications to the 0.98 cluster.
> > Then repeat the upgrade process on the 0.94 cluster.
> >
> > my 2 cents,
> > esteban.
> >
> > --
> > Cloudera, Inc.
> >
> > On Tue, Dec 16, 2014 at 9:37 AM, Andrew Purtell <[email protected]>
> > wrote:
> >
> > > I disagree. Before adding something like that to the ref guide, we
> > > should actually agree to support it as a migration strategy. We're not
> > > there yet.
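[Archive note: Esteban's low-downtime sequence above can be condensed into
a short annotated outline. This is an editorial sketch only; the phase
descriptions paraphrase his mail, and none of this is a supported or
tested procedure.]

```shell
# Editorial outline of the low-downtime 0.94 -> 0.98 path described above;
# each phase still needs cluster-specific commands and verification.
phase() { echo "== $1"; }

phase "bootstrap destination as 0.94 with snapshots and replication enabled"
phase "ExportSnapshot the source snapshots to the destination cluster"
phase "import snapshots; replicate until the delta between clusters is small"
phase "stop destination, upgrade it to 0.98 (layout migrates automatically)"
phase "run the replication bridge to catch up on edits made during upgrade"
phase "fail applications over to the 0.98 cluster"
phase "repeat the upgrade on the remaining 0.94 cluster"
```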
> > > And although it's a heroic process, we can take steps to make it less
> > > kludgy if so.
> > >
> > > On Tue, Dec 16, 2014 at 9:27 AM, Ted Yu <[email protected]> wrote:
> > >
> > > > Good summary, Brian.
> > > >
> > > > This should be added to the ref guide.
> > > >
> > > > Cheers
> > > >
> > > > On Tue, Dec 16, 2014 at 4:17 AM, Brian Jeltema <
> > > > [email protected]> wrote:
> > > >
> > > > > I have been able to export snapshots from 0.94 to 0.98. I've
> > > > > pasted the instructions that I developed and published on our
> > > > > internal wiki. I also had to significantly increase retry count
> > > > > parameters due to a high number of timeout failures during the
> > > > > export.
> > > > >
> > > > > Cross-cluster transfers
> > > > >
> > > > > To export a snapshot to a different cluster:
> > > > >
> > > > >   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
> > > > >     -snapshot snappy -copy-to proto://remhost/apps/hbase/data \
> > > > >     -mappers nummaps
> > > > >
> > > > > where snappy is the local snapshot to export, remhost is the host
> > > > > name of the default filesystem of the remote cluster that is the
> > > > > target of the export (i.e. the value of fs.defaultFS), nummaps is
> > > > > the number of mappers to run to perform the export, and proto is
> > > > > the protocol to use: hftp, hdfs, or webhdfs. Use hdfs if the
> > > > > clusters are compatible. Run this as the hbase user. If you see
> > > > > exceptions being thrown during the transfer related to lease
> > > > > expirations, reduce the number of mappers or try the -bandwidth
> > > > > option. You may also see many "File does not exist" warnings in
> > > > > the log output. These can be displayed continuously for several
> > > > > minutes for a large table, but they appear to be noise and can be
> > > > > ignored, so be patient.
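[Archive note: as a concrete illustration of the invocation Brian
describes above, here is a sketch with hypothetical values filled in (a
remote namenode nn.remote.example.com, 16 mappers, the webhdfs protocol).
The script only assembles and prints the command, since the real run needs
two live clusters and should be executed as the hbase user.]

```shell
# All values below are hypothetical placeholders; adapt them to your
# clusters before running the printed command for real.
SNAPSHOT=snappy                                    # snapshot to export
REMOTE_FS=webhdfs://nn.remote.example.com:50070    # target fs.defaultFS
MAPPERS=16                                         # export parallelism

# Assemble the ExportSnapshot command; echo it rather than run it.
CMD="hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
  -snapshot $SNAPSHOT -copy-to $REMOTE_FS/apps/hbase/data -mappers $MAPPERS"

echo "$CMD"
```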
> > > > > However, it is also very common for this command to fail due to a
> > > > > variety of file ownership conflicts, so you may need to fiddle to
> > > > > get everything right. A failure of this command often leaves
> > > > > garbage on the target system that must be deleted; if that is the
> > > > > case, the command will fail with info on what needs to be cleaned
> > > > > up.
> > > > >
> > > > > If exporting from an HBase 0.94 cluster to an HBase 0.98 cluster,
> > > > > you will need to use the webhdfs protocol (or possibly hftp,
> > > > > though I couldn't get that to work). You also need to manually
> > > > > move some files around because snapshot layouts have changed.
> > > > > Based on the example above, on the 0.98 cluster do the following:
> > > > >
> > > > > Check whether any imports already exist for the table:
> > > > >
> > > > >   hadoop fs -ls /apps/hbase/data/archive/data/default
> > > > >
> > > > > and check whether snappyTable is listed, where snappyTable is the
> > > > > source table of the snapshot (e.g. hosts).
> > > > > If the source table is listed, then merge the new snapshot data
> > > > > into the existing snapshot data:
> > > > >
> > > > >   hadoop fs -mv /apps/hbase/data/.archive/snappyTable/* \
> > > > >     /apps/hbase/data/archive/data/default/snappyTable
> > > > >   hadoop fs -rm -r /apps/hbase/data/.archive/snappyTable
> > > > >
> > > > > Otherwise, create and populate the snapshot data directory:
> > > > >
> > > > >   hadoop fs -mv /apps/hbase/data/.archive/snappyTable \
> > > > >     /apps/hbase/data/archive/data/default
> > > > >
> > > > > (snappyTable is the source table of the snappy snapshot.)
> > > > >
> > > > > In either case, update the snapshot metadata files as follows:
> > > > >
> > > > >   hadoop fs -mkdir /apps/hbase/data/.hbase-snapshot/snappy/.tabledesc
> > > > >   hadoop fs -mv \
> > > > >     /apps/hbase/data/.hbase-snapshot/snappy/.tableinfo.0000000001 \
> > > > >     /apps/hbase/data/.hbase-snapshot/snappy/.tabledesc
> > > > >
> > > > > At this point, you should be able to do a restore_snapshot from
> > > > > the HBase shell.
> > > > >
> > > > > On Dec 15, 2014, at 8:52 PM, lars hofhansl <[email protected]>
> > > > > wrote:
> > > > >
> > > > > > Nope :( Replication uses RPC and that was changed to protobufs.
> > > > > > AFAIK snapshots can also not be exported from 0.94 to 0.98. We
> > > > > > have a really shitty story here.
> > > > > >
> > > > > > From: Sean Busbey <[email protected]>
> > > > > > To: user <[email protected]>
> > > > > > Sent: Monday, December 15, 2014 5:04 PM
> > > > > > Subject: Re: 0.94 going forward
> > > > > >
> > > > > > Does replication and snapshot export work from 0.94.6+ to a 0.96
> > > > > > or 0.98 cluster?
> > > > > >
> > > > > > Presuming it does, shouldn't a site be able to use a
> > > > > > multiple-cluster setup to do a cut over of a client application?
> > > > > >
> > > > > > That doesn't help with needing downtime to do the eventual
> > > > > > upgrade, but it mitigates the impact on the downstream app.
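[Archive note: Brian's manual moves above can be gathered into a single
hedged script. This is an illustrative sketch, not a supported tool: the
paths assume the /apps/hbase/data root and snappy/snappyTable names from
his example, EXISTS must be set by hand from the earlier `hadoop fs -ls`
check, and DRYRUN=echo makes the script print each command instead of
executing it.]

```shell
# Illustrative sketch of the 0.94 -> 0.98 snapshot layout fix-up above.
# Set DRYRUN="" and run as the hbase user to execute for real.
ROOT=/apps/hbase/data
TABLE=snappyTable   # source table of the snapshot
SNAPSHOT=snappy     # snapshot name
EXISTS=no           # yes if `hadoop fs -ls $ROOT/archive/data/default` listed $TABLE
DRYRUN=echo         # echo commands instead of running them

if [ "$EXISTS" = yes ]; then
  # merge new snapshot data into the existing archive dir, then clean up;
  # the glob stays quoted so hadoop (not the local shell) expands it
  $DRYRUN hadoop fs -mv "$ROOT/.archive/$TABLE/*" \
    "$ROOT/archive/data/default/$TABLE"
  $DRYRUN hadoop fs -rm -r "$ROOT/.archive/$TABLE"
else
  # first import for this table: move the whole old-layout directory
  $DRYRUN hadoop fs -mv "$ROOT/.archive/$TABLE" "$ROOT/archive/data/default"
fi

# in either case, move the table descriptor into the new metadata layout
$DRYRUN hadoop fs -mkdir "$ROOT/.hbase-snapshot/$SNAPSHOT/.tabledesc"
$DRYRUN hadoop fs -mv "$ROOT/.hbase-snapshot/$SNAPSHOT/.tableinfo.0000000001" \
  "$ROOT/.hbase-snapshot/$SNAPSHOT/.tabledesc"
```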
> > > > > >
> > > > > > --
> > > > > > Sean
> > > > > >
> > > > > > On Dec 15, 2014 6:51 PM, "Jeremy Carroll" <[email protected]>
> > > > > > wrote:
> > > > > >
> > > > > >> Which is why I feel that a lot of customers are still on 0.94.
> > > > > >> Pretty much trapped unless you want to take downtime for your
> > > > > >> site. Any type of guidance would be helpful. We are currently
> > > > > >> in the process of designing our own system to deal with this.
> > > > > >>
> > > > > >> On Mon, Dec 15, 2014 at 4:47 PM, Andrew Purtell <
> > > > > >> [email protected]> wrote:
> > > > > >>>
> > > > > >>> Zero downtime upgrade from 0.94 won't be possible. See
> > > > > >>> http://hbase.apache.org/book.html#d0e5199
> > > > > >>>
> > > > > >>> On Mon, Dec 15, 2014 at 4:44 PM, Jeremy Carroll <
> > > > > >>> [email protected]> wrote:
> > > > > >>>>
> > > > > >>>> Looking for guidance on how to do a zero downtime upgrade
> > > > > >>>> from 0.94 -> 0.98 (or 1.0 if it launches soon). As soon as we
> > > > > >>>> can figure this out, we will migrate over.
> > > > > >>>>
> > > > > >>>> On Mon, Dec 15, 2014 at 1:37 PM, Esteban Gutierrez <
> > > > > >>>> [email protected]> wrote:
> > > > > >>>>>
> > > > > >>>>> Hi Lars,
> > > > > >>>>>
> > > > > >>>>> Thanks for bringing this up for discussion. From my
> > > > > >>>>> experience I can tell that 0.94 is very stable, but that
> > > > > >>>>> shouldn't be a blocker to considering EOL'ing it. Are you
> > > > > >>>>> considering any specific timeframe for that?
> > > > > >>>>>
> > > > > >>>>> thanks,
> > > > > >>>>> esteban.
> > > > > >>>>>
> > > > > >>>>> --
> > > > > >>>>> Cloudera, Inc.
> > > > > >>>>>
> > > > > >>>>> On Mon, Dec 15, 2014 at 11:46 AM, Koert Kuipers <
> > > > > >>>>> [email protected]> wrote:
> > > > > >>>>>>
> > > > > >>>>>> given that CDH4 is hbase 0.94 i dont believe nobody is
> > > > > >>>>>> using it. for our clients the majority is on 0.94 (versus
> > > > > >>>>>> 0.96 and up).
> > > > > >>>>>>
> > > > > >>>>>> so i am going with 1), it's very stable!
> > > > > >>>>>>
> > > > > >>>>>> On Mon, Dec 15, 2014 at 1:53 PM, lars hofhansl <
> > > > > >>>>>> [email protected]> wrote:
> > > > > >>>>>>>
> > > > > >>>>>>> Over the past few months the rate of change into 0.94 has
> > > > > >>>>>>> slowed significantly. 0.94.25 was released on Nov 15th,
> > > > > >>>>>>> and since then we have had only 4 changes.
> > > > > >>>>>>>
> > > > > >>>>>>> This could mean two things: (1) 0.94 is very stable now,
> > > > > >>>>>>> or (2) nobody is using it (at least nobody is contributing
> > > > > >>>>>>> to it anymore).
> > > > > >>>>>>>
> > > > > >>>>>>> If anybody out there is still using 0.94 and is not
> > > > > >>>>>>> planning to upgrade to 0.98 or later soon (which will
> > > > > >>>>>>> require downtime), please speak up. Otherwise it might be
> > > > > >>>>>>> time to think about EOL'ing 0.94.
> > > > > >>>>>>>
> > > > > >>>>>>> It's not actually much work to do these releases,
> > > > > >>>>>>> especially when they are so small, but I'd like to
> > > > > >>>>>>> continue only if they are actually used. In any case, I am
> > > > > >>>>>>> going to spin 0.94.26 with the current 4 fixes today or
> > > > > >>>>>>> tomorrow.
> > > > > >>>>>>>
> > > > > >>>>>>> -- Lars
> > > > > >>>>>>>
> > > > > >>>
> > > > > >>> --
> > > > > >>> Best regards,
> > > > > >>>
> > > > > >>> - Andy
> > > > > >>>
> > > > > >>> Problems worthy of attack prove their worth by hitting back.
> > > > > >>> - Piet Hein (via Tom White)
> > >
> > > --
> > > Best regards,
> > >
> > > - Andy
> > >
> > > Problems worthy of attack prove their worth by hitting back. - Piet
> > > Hein (via Tom White)
