So what about a standalone service other than the master? Could you use your own procedure store in that service?

2016-09-23 11:28 GMT+08:00 Ted Yu <yuzhih...@gmail.com>:
> An earlier implementation was client driven.
>
> But with that approach, it is hard to resume if there is error midway.
> Using Procedure V2 makes the backup / restore more robust.
>
> Another consideration is for security. It is hard to enforce security
> (to be implemented) for client driven actions.
>
> Cheers
>
> > On Sep 22, 2016, at 8:15 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >
> > No, this misses Matteo's finer point, which is "shelling out" from
> > the master directly to run MR is a first. Why not drive this with a
> > utility derived from Tool?
> >
> > On Sep 22, 2016, at 7:57 PM, Vladimir Rodionov <vladrodio...@gmail.com> wrote:
> >
> >>>> In our production cluster, it is a common case we just have HDFS and
> >>>> HBase deployed.
> >>>> If our Master/RS depend on MR framework (especially some features we
> >>>> have not used at all), it introduced another cost for maintain. I
> >>>> don't think it is a good idea.
> >>
> >> So , you are not backup users in this case. Many our customers have
> >> full stack deployed and want see backup to be a standard feature.
> >> Besides this, nothing will happen in your cluster if you won't be
> >> doing backups.
> >>
> >> This discussion (we do not want see M/R dependency) goes to nowhere.
> >> We asked already, at least twice, to suggest another framework (other
> >> than M/R) for bulk data copy with *conversion*. Still waiting for
> >> suggestions.
> >>
> >> -Vlad
> >>
> >>> On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> >>>
> >>> If MR framework is not deployed in the cluster, hbase still functions
> >>> normally (post merge).
> >>>
> >>> In terms of build time dependency, we have long been depending on
> >>> mapreduce. Take a look at ExportSnapshot.
> >>>
> >>> Cheers
> >>>
> >>> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <heng.chen.1...@gmail.com> wrote:
> >>>
> >>>> In our production cluster, it is a common case we just have HDFS and
> >>>> HBase deployed.
> >>>> If our Master/RS depend on MR framework (especially some features we
> >>>> have not used at all), it introduced another cost for maintain. I
> >>>> don't think it is a good idea.
> >>>>
> >>>> 2016-09-23 10:28 GMT+08:00 张铎 <palomino...@gmail.com>:
> >>>>> To be specific, for example, our nice Backup/Restore feature, if we
> >>>>> think this is not a core feature of HBase, then we could make it
> >>>>> depend on MR, and start a standalone BackupManager instance that
> >>>>> submits MR jobs to do periodical maintenance job. And if we think
> >>>>> this is a core feature that everyone should use it, then we'd
> >>>>> better implement it without MR dependency, like DLS.
> >>>>>
> >>>>> Thanks.
> >>>>>
> >>>>> 2016-09-23 10:11 GMT+08:00 张铎 <palomino...@gmail.com>:
> >>>>>
> >>>>>> I'm -1 on let master or rs launch MR jobs. It is OK that some of
> >>>>>> our features depend on MR but I think the bottom line is that we
> >>>>>> should launch the jobs from outside manually or by other services.
> >>>>>>
> >>>>>> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <andrew.purt...@gmail.com>:
> >>>>>>
> >>>>>>> Ok, got it. Well "shelling out" is on the line I think, so a fair
> >>>>>>> question.
> >>>>>>>
> >>>>>>> Can this be driven by a utility derived from Tool like our other
> >>>>>>> MR apps? The issue is needing the AccessController to decide if
> >>>>>>> allowed? But nothing prevents the user from running the job
> >>>>>>> manually/independently, right?
> >>>>>>>
> >>>>>>>> On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <theo.berto...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> just a remark. my query was not about tools using MR (everyone i
> >>>>>>>> think is ok with those).
> >>>>>>>> the topic was about: "are we ok with running MR jobs from Master
> >>>>>>>> and RSs code?" since this will be the first time we do this
> >>>>>>>>
> >>>>>>>> Matteo
> >>>>>>>>
> >>>>>>>>> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <d...@hortonworks.com> wrote:
> >>>>>>>>>
> >>>>>>>>> Very much agree; for tools like ExportSnapshot / Backup /
> >>>>>>>>> Restore, it's fine to be dependent on MR. MR is the right
> >>>>>>>>> framework for such. We should also do compactions using MR
> >>>>>>>>> (just saying :) )
> >>>>>>>>> ________________________________________
> >>>>>>>>> From: Ted Yu <yuzhih...@gmail.com>
> >>>>>>>>> Sent: Thursday, September 22, 2016 2:00 PM
> >>>>>>>>> To: dev@hbase.apache.org
> >>>>>>>>> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> >>>>>>>>>
> >>>>>>>>> I agree - backup / restore is in the same category as import /
> >>>>>>>>> export.
> >>>>>>>>>
> >>>>>>>>> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <andrew.purt...@gmail.com> wrote:
> >>>>>>>>>
> >>>>>>>>>> Backup is extra tooling around core in my opinion. Like import
> >>>>>>>>>> or export.
> >>>>>>>>>> Or the optional MOB tool. It's fine.
> >>>>>>>>>>
> >>>>>>>>>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <mberto...@apache.org> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> What's the latest opinion around running MR jobs from hbase
> >>>>>>>>>>> (Master or RS)?
> >>>>>>>>>>>
> >>>>>>>>>>> I remember in the past that there was discussion about not
> >>>>>>>>>>> having MR has direct dependency of hbase.
> >>>>>>>>>>>
> >>>>>>>>>>> I think some of discussion where around MOB that had a MR job
> >>>>>>>>>>> to compact, that later was transformed in a non-MR job to be
> >>>>>>>>>>> merged, I think we had a similar discussion for log
> >>>>>>>>>>> split/replay.
> >>>>>>>>>>>
> >>>>>>>>>>> the latest is the new Backup feature (HBASE-7912), that runs a
> >>>>>>>>>>> MR job from the master to copy data or restore data.
> >>>>>>>>>>> (backup is also "not really core" as in.. if you don't use
> >>>>>>>>>>> backup you'll not end up running MR jobs, but this was
> >>>>>>>>>>> probably true for MOB as in "if you don't enable MOB you
> >>>>>>>>>>> don't need MR")
> >>>>>>>>>>>
> >>>>>>>>>>> any thoughts? do we a rule that says "we don't want to have
> >>>>>>>>>>> hbase run MR jobs, only tool started manually by the user can
> >>>>>>>>>>> do that". or can we start adding MR calls around without
> >>>>>>>>>>> problems?