Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-10-05 Thread Ted Yu
patch for review... > > > From: Ted Yu > Sent: Tuesday, October 04, 2016 6:28 PM > To: dev@hbase.apache.org > Subject: Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started > by Master or RS) > > Refactoring work over in HBASE-16727 is read

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-10-05 Thread Devaraj Das
(WAS => Re: [DISCUSSION] MR jobs started by Master or RS) Refactoring work over in HBASE-16727 is ready for review. Kindly provide your feedback. Thanks On Mon, Oct 3, 2016 at 3:05 PM, Andrew Purtell wrote: > This sounds good to me. > I'd be at least +0 as to merging the branc

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-10-04 Thread Ted Yu
take a backup/restore his/her tables), we can discuss the > "backup > > service" or something else. > > Folks - Stack / Andrew / Matteo / others, please speak up if you disagree > > with the above. Would like to get over this merge-to-master hump > obviously

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-10-03 Thread Andrew Purtell
ionov > Sent: Monday, September 26, 2016 11:48 AM > To: dev@hbase.apache.org > Subject: Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs > started by Master or RS) > > Ok, we had internal discussion and this is what we are suggesting now: > > 1. We will create separ

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-27 Thread Matteo Bertozzi
; Sent: Monday, September 26, 2016 11:48 AM > To: dev@hbase.apache.org > Subject: Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs > started by Master or RS) > > Ok, we had internal discussion and this is what we are suggesting now: > > 1. We will create separate module

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-26 Thread Devaraj Das
apache.org Subject: Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS) Ok, we had internal discussion and this is what we are suggesting now: 1. We will create separate module (hbase-backup) and move server-side code there. 2. Master and RS will be MR and backup free

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-26 Thread Vladimir Rodionov
Ok, we had internal discussion and this is what we are suggesting now: 1. We will create separate module (hbase-backup) and move server-side code there. 2. Master and RS will be MR and backup free. 3. The code from Master will be moved into standalone service (BackupService) for procedure orchestr

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Stack
On Sat, Sep 24, 2016 at 9:58 AM, Andrew Purtell wrote: > At branch merge voting time now more eyes are getting on the design issues > with dissenting opinion emerging. This is the branch merge process working > as our community has designed it. Because this is the first full project > review of t

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
How would security be enforced on the client side? On Sat, Sep 24, 2016 at 12:21 PM, Vladimir Rodionov wrote: > >> The standalone service so far > > 1, 2, 3 can be done in client side as well. Are you going to implement HA > for the service? If not, service can fail and will require clean up/repair

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Vladimir Rodionov
>> The standalone service so far 1, 2, 3 can be done in client side as well. Are you going to implement HA for the service? If not, service can fail and will require clean up/repair on restart. The same can be done with a client - side tool (in repair mode) -1 for the separate service. KISS rule

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
The standalone service so far seems to be middle ground having the following advantages: 1. utilization of existing proc V2 framework for fault tolerance 2. friendliness to security support to be implemented in the next phase - security is hard to enforce from client side 3. not introducing MR cal

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Andrew Purtell
In my own HBase garden I would say HDFS is an obvious requirement but something we'd like to have an alternative for. We don't have much luck with the HDFS community and some others have patches waiting on six years or so. Adding YARN/MR to the list of must haves would be unwarranted and unneces

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Andrew Purtell
I don't see you prevailing with this line of argument but you are welcome to try. Don't shoot the messenger please. On Sep 24, 2016, at 11:08 AM, Vladimir Rodionov wrote: >>> The key takeaway seems to be don't call out to an external framework we > don't own from master (or regionserver) code.

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Vladimir Rodionov
>> So the standalone service would run out of proc - in the same vein as REST or thrift server. Ted, running separate process/service to coordinate backups is not a good idea. We have already a lot of them. On Sat, Sep 24, 2016 at 11:20 AM, Ted Yu wrote: > bq. don't call out to an external fram

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Vladimir Rodionov
>> HBase is a founding partner of a Hadoop stack: HDFS, MapReduce, HBase and should stay in Hadoop stack (with HDFS and Yarn/MapReduce). The world (of NoSQL) outside of Hadoop is scary (C* is probably the least scary of all). I personally do not mind code refactoring and moving everything from

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
bq. don't call out to an external framework we don't own from master (or regionserver) code So the standalone service would run out of proc - in the same vein as REST or thrift server. Cheers On Sat, Sep 24, 2016 at 10:40 AM, Andrew Purtell wrote: > I was attempting to summarize Ted. > > A new

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Vladimir Rodionov
>> The key takeaway seems to be don't call out to an external framework we don't own from master (or regionserver) code. Should we ban HDFS as well? HBase is a founding partner of a Hadoop stack: HDFS, MapReduce, HBase -Vlad On Sat, Sep 24, 2016 at 10:40 AM, Andrew Purtell wrote: > I was attem

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Andrew Purtell
I was attempting to summarize Ted. A new maven module sounds like a good idea to me. Or we could move all the tools that use MR out to one. Or... The key takeaway seems to be don't call out to an external framework we don't own from master (or regionserver) code. > On Sep 24, 2016, at 10:15

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
bq. Internally the tool can also use the procedure framework for state durability Isn't this the standalone service I proposed this morning ? bq. Move cross HBase and MR coordination to a separate tool Where should this tool live (hbase-backup module) ? Thanks On Sat, Sep 24, 2016 at 9:58 AM,

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Andrew Purtell
At branch merge voting time now more eyes are getting on the design issues with dissenting opinion emerging. This is the branch merge process working as our community has designed it. Because this is the first full project review of the code and implementation I think we all have to be flexible.

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
bq. procedure gives you a retry mechanism on failure We do need this mechanism. Take a look at the multi-step in FullTableBackupProcedure, etc. bq. let the user export it later when he wants This would make supporting security more complex (user A shouldn't be exporting user B's backup). And it

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Matteo Bertozzi
On Sat, Sep 24, 2016 at 7:19 AM, Ted Yu wrote: > Ideally the export should have one job running which does the retry (on > failed partition) itself. > procedure gives you a retry mechanism on failure. if you don't use that, then you don't need procedure. if you want you can start a procedure exe

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
bq. run the tool over and over As Vlad mentioned earlier, potentially TB of data is involved. Repeatedly running the tool is not friendly to network. Ideally the export should have one job running which does the retry (on failed partition) itself. HFileSplitter is another class which depends on m
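A rough illustration of "one job running which does the retry (on failed partition) itself": the sketch below re-attempts only the partitions that failed, rather than re-running the whole export. It is not HBase or MapReduce code. `run_with_partition_retry`, `copy_partition`, and `max_attempts` are hypothetical names, and the assumption that a failed partition surfaces as an `OSError` is made up for the example.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def run_with_partition_retry(
    partitions: Iterable[str],
    copy_partition: Callable[[str], None],  # hypothetical per-partition copy step
    max_attempts: int = 3,                  # hypothetical retry budget
) -> Tuple[List[str], Dict[str, int]]:
    """Copy each partition, retrying only the ones that failed.

    Returns (partitions still failing, number of attempts made per partition).
    """
    pending = list(partitions)
    attempts: Dict[str, int] = {}
    for attempt in range(1, max_attempts + 1):
        failed = []
        for part in pending:
            attempts[part] = attempt
            try:
                copy_partition(part)
            except OSError:
                # Assumption: a failed copy raises OSError; keep it for the next round.
                failed.append(part)
        pending = failed
        if not pending:
            break
    return pending, attempts
```

Partitions that succeed on the first pass are never touched again, so a retry costs network traffic proportional to what failed, not to the whole data set.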

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Matteo Bertozzi
as far as I understand the code, you don't need procedure for the export itself. the export operation is already idempotent, since you are just copying files. if the file exists and is complete (check length, checksum, ...) you can skip it, otherwise you'll send it over again. you need the proc for
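The idempotent-copy idea described above (skip a file if it already exists at the destination and is complete, otherwise send it again) can be sketched as follows. This is a minimal local-filesystem illustration, not the actual export tool: `export_files`, `is_complete`, and `file_digest` are hypothetical names, and SHA-256 stands in for whatever checksum the real tool would use.

```python
import hashlib
import shutil
from pathlib import Path
from typing import List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def is_complete(src: Path, dst: Path) -> bool:
    """True if dst exists and matches src in both length and checksum."""
    if not dst.is_file():
        return False
    if dst.stat().st_size != src.stat().st_size:
        return False  # cheap length check first; skips hashing on a mismatch
    return file_digest(dst) == file_digest(src)


def export_files(src_dir: Path, dst_dir: Path) -> List[str]:
    """Copy files from src_dir to dst_dir; return the relative names actually copied.

    Safe to run repeatedly: files already complete at the destination are skipped.
    """
    copied = []
    for src in sorted(p for p in src_dir.rglob("*") if p.is_file()):
        rel = src.relative_to(src_dir)
        dst = dst_dir / rel
        if is_complete(src, dst):
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(src, dst)  # overwrites a partial or stale copy
        copied.append(rel.as_posix())
    return copied
```

Running `export_files` a second time after a successful run copies nothing, and after a partial failure it re-sends only the missing or truncated files.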

Re: Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-24 Thread Ted Yu
Master is involved in this discussion because currently only Master instantiates ProcedureExecutor which runs the 3 Procedures for backup / restore. What if an optional standalone service which hosts ProcedureExecutor is used for this purpose ? Would that have better chance of giving us middle gro

Backup Implementation (WAS => Re: [DISCUSSION] MR jobs started by Master or RS)

2016-09-23 Thread Stack
(Moved out of the Master doing MR DISCUSSION) On Fri, Sep 23, 2016 at 12:24 PM, Vladimir Rodionov wrote: > >> -1 on that backup be in core hbase > > Not sure I understand what it means. > > Sorry for the imprecision. The -1 is NOT against backup/restore. I am -1 on MR as a dependency and so -1