So what about a standalone service, separate from the master? Could you use
your own procedure store in that service?

2016-09-23 11:28 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:

> An earlier implementation was client driven.
>
> But with that approach, it is hard to resume if there is an error midway.
> Using Procedure V2 makes the backup / restore more robust.
>
> Another consideration is security. It is hard to enforce security (to
> be implemented) for client-driven actions.
>
> Cheers
>
> > On Sep 22, 2016, at 8:15 PM, Andrew Purtell <andrew.purt...@gmail.com>
> wrote:
> >
> > No, this misses Matteo's finer point, which is that "shelling out" from
> > the master directly to run MR is a first. Why not drive this with a
> > utility derived from Tool?
> >
> > On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov <vladrodio...@gmail.com>
> wrote:
> >
> >>>> In our production cluster, it is a common case that we have only HDFS
> >>>> and HBase deployed.
> >>>> If our Master/RS depend on the MR framework (especially for features we
> >>>> do not use at all), it introduces another maintenance cost. I
> >>>> don't think it is a good idea.
> >>
> >> So, you are not a backup user in this case. Many of our customers have
> >> the full stack deployed and want to see backup as a standard feature.
> >> Besides, nothing will happen in your cluster if you are not doing
> >> backups.
> >>
> >> This discussion (we do not want to see an M/R dependency) is going
> >> nowhere. We have already asked, at least twice, for suggestions of
> >> another framework (other than M/R) for bulk data copy with *conversion*.
> >> Still waiting for suggestions.
> >>
> >> -Vlad
> >>
> >>
> >>
> >>
> >>> On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>
> >>> If MR framework is not deployed in the cluster, hbase still functions
> >>> normally (post merge).
> >>>
> >>> In terms of build time dependency, we have long been depending on
> >>> mapreduce. Take a look at ExportSnapshot.
> >>>
> >>> Cheers
> >>>
> >>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1...@gmail.com>
> >>> wrote:
> >>>
> >>>> In our production cluster, it is a common case that we have only HDFS
> >>>> and HBase deployed.
> >>>> If our Master/RS depend on the MR framework (especially for features we
> >>>> do not use at all), it introduces another maintenance cost. I
> >>>> don't think it is a good idea.
> >>>>
> >>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino...@gmail.com>:
> >>>>> To be specific, take our nice Backup/Restore feature as an example. If
> >>>>> we think this is not a core feature of HBase, then we could make it
> >>>>> depend on MR and start a standalone BackupManager instance that submits
> >>>>> MR jobs to do periodic maintenance work. And if we think this is a core
> >>>>> feature that everyone should use, then we'd better implement it without
> >>>>> an MR dependency, like DLS.
> >>>>>
> >>>>> Thanks.
> >>>>>
> >>>>> 2016-09-23 10:11 GMT+08:00 张铎 <palomino...@gmail.com>:
> >>>>>
> >>>>>> I'm -1 on letting the master or RS launch MR jobs. It is OK that some
> >>>>>> of our features depend on MR, but I think the bottom line is that we
> >>>>>> should launch the jobs from outside, manually or by other services.
> >>>>>>
> >>>>>> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <andrew.purt...@gmail.com
> >:
> >>>>>>
> >>>>>>> Ok, got it. Well "shelling out" is on the line I think, so a fair
> >>>>>>> question.
> >>>>>>>
> >>>>>>> Can this be driven by a utility derived from Tool like our other MR
> >>>> apps?
> >>>>>>> The issue is needing the AccessController to decide if allowed? But
> >>>> nothing
> >>>>>>> prevents the user from running the job manually/independently,
> right?
> >>>>>>>
> >>>>>>>> On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <
> >>>> theo.berto...@gmail.com>
> >>>>>>> wrote:
> >>>>>>>>
> >>>>>>>> just a remark. my query was not about tools using MR (everyone i
> >>>> think
> >>>>>>> is
> >>>>>>>> ok with those).
> >>>>>>>> the topic was about: "are we ok with running MR jobs from Master
> >>> and
> >>>> RSs
> >>>>>>>> code?" since this will be the first time we do this
> >>>>>>>>
> >>>>>>>> Matteo
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <
> >>> d...@hortonworks.com>
> >>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>> Very much agree; for tools like ExportSnapshot / Backup /
> Restore,
> >>>> it's
> >>>>>>>>> fine to be dependent on MR. MR is the right framework for such.
> We
> >>>>>>> should
> >>>>>>>>> also do compactions using MR (just saying :) )
> >>>>>>>>> ________________________________________
> >>>>>>>>> From: Ted Yu <yuzhih...@gmail.com>
> >>>>>>>>> Sent: Thursday, September 22, 2016 2:00 PM
> >>>>>>>>> To: dev@hbase.apache.org
> >>>>>>>>> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> >>>>>>>>>
> >>>>>>>>> I agree - backup / restore is in the same category as import /
> >>>> export.
> >>>>>>>>>
> >>>>>>>>> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <
> >>>>>>> andrew.purt...@gmail.com>
> >>>>>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Backup is extra tooling around core in my opinion. Like import
> or
> >>>>>>> export.
> >>>>>>>>>> Or the optional MOB tool. It's fine.
> >>>>>>>>>>
> >>>>>>>>>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <
> >>>> mberto...@apache.org>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> What's the latest opinion around running MR jobs from hbase
> >>>> (Master
> >>>>>>> or
> >>>>>>>>>> RS)?
> >>>>>>>>>>>
> >>>>>>>>>>> I remember that in the past there was discussion about not
> >>>>>>>>>>> having MR as a direct dependency of hbase.
> >>>>>>>>>>>
> >>>>>>>>>>> I think some of the discussion was around MOB, which had an MR
> >>>>>>>>>>> job to compact that was later turned into a non-MR job in order
> >>>>>>>>>>> to be merged. I think we had a similar discussion for log
> >>>>>>>>>>> split/replay.
> >>>>>>>>>>>
> >>>>>>>>>>> The latest is the new Backup feature (HBASE-7912), which runs an
> >>>>>>>>>>> MR job from the master to copy data or restore data.
> >>>>>>>>>>> (Backup is also "not really core", as in: if you don't use
> >>>>>>>>>>> backup you'll not end up running MR jobs. But this was probably
> >>>>>>>>>>> true for MOB too, as in "if you don't enable MOB you don't need
> >>>>>>>>>>> MR".)
> >>>>>>>>>>>
> >>>>>>>>>>> Any thoughts? Do we have a rule that says "we don't want to have
> >>>>>>>>>>> hbase run MR jobs; only tools started manually by the user can do
> >>>>>>>>>>> that"? Or can we start adding MR calls around without problems?
> >>>
>
