>> In our production cluster, it is a common case we just have HDFS and
>> HBase deployed.
>> If our Master/RS depend on MR framework (especially some features we
>> have not used at all), it introduced another cost for maintain. I
>> don't think it is a good idea.
So, you are not a backup user in this case. Many of our customers have the full stack deployed and want to see backup as a standard feature. Besides, nothing will happen in your cluster if you are not doing backups.

This discussion (we do not want to see an M/R dependency) is going nowhere. We have already asked, at least twice, for suggestions of another framework (other than M/R) for bulk data copy with *conversion*. We are still waiting for suggestions.

-Vlad

On Thu, Sep 22, 2016 at 7:49 PM, Ted Yu <[email protected]> wrote:

> If MR framework is not deployed in the cluster, hbase still functions
> normally (post merge).
>
> In terms of build time dependency, we have long been depending on
> mapreduce. Take a look at ExportSnapshot.
>
> Cheers
>
> On Thu, Sep 22, 2016 at 7:42 PM, Heng Chen <[email protected]> wrote:
>
> > In our production cluster, it is a common case we just have HDFS and
> > HBase deployed.
> > If our Master/RS depend on MR framework (especially some features we
> > have not used at all), it introduced another cost for maintain. I
> > don't think it is a good idea.
> >
> > 2016-09-23 10:28 GMT+08:00 张铎 <[email protected]>:
> > > To be specific, for example, our nice Backup/Restore feature, if we think
> > > this is not a core feature of HBase, then we could make it depend on MR,
> > > and start a standalone BackupManager instance that submits MR jobs to do
> > > periodical maintenance job. And if we think this is a core feature that
> > > everyone should use it, then we'd better implement it without MR
> > > dependency, like DLS.
> > >
> > > Thanks.
> > >
> > > 2016-09-23 10:11 GMT+08:00 张铎 <[email protected]>:
> > >
> > >> I'm -1 on let master or rs launch MR jobs. It is OK that some of our
> > >> features depend on MR but I think the bottom line is that we should launch
> > >> the jobs from outside manually or by other services.
> > >>
> > >> 2016-09-23 9:47 GMT+08:00 Andrew Purtell <[email protected]>:
> > >>
> > >>> Ok, got it.
> > >>> Well "shelling out" is on the line I think, so a fair
> > >>> question.
> > >>>
> > >>> Can this be driven by a utility derived from Tool like our other MR apps?
> > >>> The issue is needing the AccessController to decide if allowed? But nothing
> > >>> prevents the user from running the job manually/independently, right?
> > >>>
> > >>> > On Sep 22, 2016, at 3:44 PM, Matteo Bertozzi <[email protected]> wrote:
> > >>> >
> > >>> > just a remark. my query was not about tools using MR (everyone i think is
> > >>> > ok with those).
> > >>> > the topic was about: "are we ok with running MR jobs from Master and RSs
> > >>> > code?" since this will be the first time we do this
> > >>> >
> > >>> > Matteo
> > >>> >
> > >>> >
> > >>> >> On Thu, Sep 22, 2016 at 2:49 PM, Devaraj Das <[email protected]> wrote:
> > >>> >>
> > >>> >> Very much agree; for tools like ExportSnapshot / Backup / Restore, it's
> > >>> >> fine to be dependent on MR. MR is the right framework for such. We should
> > >>> >> also do compactions using MR (just saying :) )
> > >>> >> ________________________________________
> > >>> >> From: Ted Yu <[email protected]>
> > >>> >> Sent: Thursday, September 22, 2016 2:00 PM
> > >>> >> To: [email protected]
> > >>> >> Subject: Re: [DISCUSSION] MR jobs started by Master or RS
> > >>> >>
> > >>> >> I agree - backup / restore is in the same category as import / export.
> > >>> >>
> > >>> >> On Thu, Sep 22, 2016 at 1:58 PM, Andrew Purtell <[email protected]>
> > >>> >> wrote:
> > >>> >>
> > >>> >>> Backup is extra tooling around core in my opinion. Like import or export.
> > >>> >>> Or the optional MOB tool. It's fine.
> > >>> >>>
> > >>> >>>> On Sep 22, 2016, at 1:50 PM, Matteo Bertozzi <[email protected]> wrote:
> > >>> >>>>
> > >>> >>>> What's the latest opinion around running MR jobs from hbase (Master or
> > >>> >>>> RS)?
> > >>> >>>>
> > >>> >>>> I remember in the past that there was discussion about not having MR has
> > >>> >>>> direct dependency of hbase.
> > >>> >>>>
> > >>> >>>> I think some of discussion where around MOB that had a MR job to compact,
> > >>> >>>> that later was transformed in a non-MR job to be merged, I think we had a
> > >>> >>>> similar discussion for log split/replay.
> > >>> >>>>
> > >>> >>>> the latest is the new Backup feature (HBASE-7912), that runs a MR job from
> > >>> >>>> the master to copy data or restore data.
> > >>> >>>> (backup is also "not really core" as in.. if you don't use backup you'll
> > >>> >>>> not end up running MR jobs, but this was probably true for MOB as in "if
> > >>> >>>> you don't enable MOB you don't need MR")
> > >>> >>>>
> > >>> >>>> any thoughts? do we a rule that says "we don't want to have hbase run MR
> > >>> >>>> jobs, only tool started manually by the user can do that". or can we start
> > >>> >>>> adding MR calls around without problems?
> > >>> >>>
> > >>> >>
> > >>> >
> > >>
> > >>
> > >
