[jira] [Commented] (HBASE-27238) Backport Backup/Restore to 2.x

2023-01-23 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-27238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17680071#comment-17680071
 ] 

Mallikarjun commented on HBASE-27238:
-

[~bbeaudreault] Thanks for taking the time to review this patch.

> Backport Backup/Restore to 2.x
> --
>
> Key: HBASE-27238
> URL: https://issues.apache.org/jira/browse/HBASE-27238
> Project: HBase
>  Issue Type: New Feature
>  Components: backport, backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 2.6.0
>
>
> Backport backup/restore to 2.x branch. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (HBASE-27582) Errorprone cleanup in hbase-backup

2023-01-20 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun reassigned HBASE-27582:
---

Assignee: Mallikarjun

> Errorprone cleanup in hbase-backup
> --
>
> Key: HBASE-27582
> URL: https://issues.apache.org/jira/browse/HBASE-27582
> Project: HBase
>  Issue Type: Task
>Reporter: Bryan Beaudreault
>Assignee: Mallikarjun
>Priority: Minor
>
> I noticed a bunch of javac warnings while backporting the backups feature to 
> branch-2. The same problems exist in the master branch. Let's clean up Error 
> Prone warnings in both branches once the backport lands.
> See 
> [https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4770/10/artifact/yetus-general-check/output/diff-compile-javac-root.txt]
>  for initial set to fix. Mostly in tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HBASE-27582) Errorprone cleanup in hbase-backup

2023-01-20 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-27582 started by Mallikarjun.
---
> Errorprone cleanup in hbase-backup
> --
>
> Key: HBASE-27582
> URL: https://issues.apache.org/jira/browse/HBASE-27582
> Project: HBase
>  Issue Type: Task
>Reporter: Bryan Beaudreault
>Assignee: Mallikarjun
>Priority: Minor
>
> I noticed a bunch of javac warnings while backporting the backups feature to 
> branch-2. The same problems exist in the master branch. Let's clean up Error 
> Prone warnings in both branches once the backport lands.
> See 
> [https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-4770/10/artifact/yetus-general-check/output/diff-compile-javac-root.txt]
>  for initial set to fix. Mostly in tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Work started] (HBASE-27238) Backport Backup/Restore to 2.x

2022-07-25 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-27238 started by Mallikarjun.
---
> Backport Backup/Restore to 2.x
> --
>
> Key: HBASE-27238
> URL: https://issues.apache.org/jira/browse/HBASE-27238
> Project: HBase
>  Issue Type: New Feature
>  Components: backport, backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Backport backup/restore to 2.x branch. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (HBASE-27238) Backport Backup/Restore to 2.x

2022-07-25 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-27238?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-27238:

Description: Backport backup/restore to 2.x branch. 

> Backport Backup/Restore to 2.x
> --
>
> Key: HBASE-27238
> URL: https://issues.apache.org/jira/browse/HBASE-27238
> Project: HBase
>  Issue Type: New Feature
>  Components: backport, backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Backport backup/restore to 2.x branch. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (HBASE-27238) Backport Backup/Restore to 2.x

2022-07-25 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-27238:
---

 Summary: Backport Backup/Restore to 2.x
 Key: HBASE-27238
 URL: https://issues.apache.org/jira/browse/HBASE-27238
 Project: HBase
  Issue Type: New Feature
  Components: backport, backuprestore
Reporter: Mallikarjun
Assignee: Mallikarjun






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2022-06-26 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558902#comment-17558902
 ] 

Mallikarjun edited comment on HBASE-26322 at 6/26/22 4:11 PM:
--

#1. This is something I had not thought of, and even now I don't see a way to 
solve this problem. Do you have any suggestions?

#2. Even during backup the rsgroup is considered and the backup taken 
accordingly; #1 was about this. Sorry, can you elaborate on what you are 
confused about?


was (Author: rda3mon):
#1. This is something I had not thought of, and even now I don't see a way to 
solve this problem. Do you have any suggestions?

#2. Even during backup the rsgroup is considered and the backup taken 
accordingly. This was about #1. Sorry, can you elaborate on what you are 
confused about?

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-26322) Add rsgroup support for Backup

2022-06-26 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558902#comment-17558902
 ] 

Mallikarjun commented on HBASE-26322:
-

#1. This is something I had not thought of, and even now I don't see a way to 
solve this problem. Do you have any suggestions?

#2. Even during backup the rsgroup is considered and the backup taken 
accordingly. This was about #1. Sorry, can you elaborate on what you are 
confused about?

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26034) Add support to take parallel backups

2022-06-26 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558900#comment-17558900
 ] 

Mallikarjun edited comment on HBASE-26034 at 6/26/22 3:55 PM:
--

That part I did not solve for. Each backup would take all the WAL files of 
those regionservers, resulting in more data than necessary. This problem 
exists only for incremental backups, since they depend on WAL files.


was (Author: rda3mon):
That part I could not solve for. Each backup would take all the WAL files of 
those regionservers, resulting in more data than necessary. This problem 
exists only for incremental backups, since they depend on WAL files.

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-4
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-26034) Add support to take parallel backups

2022-06-26 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17558900#comment-17558900
 ] 

Mallikarjun commented on HBASE-26034:
-

That part I could not solve for. Each backup would take all the WAL files of 
those regionservers, resulting in more data than necessary. This problem 
exists only for incremental backups, since they depend on WAL files.

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-4
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26034) Add support to take parallel backups

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555619#comment-17555619
 ] 

Mallikarjun edited comment on HBASE-26034 at 6/17/22 1:44 PM:
--

In the existing implementation, one can take only a single backup at a time, 
which takes an exclusive system-wide lock, resulting in some problems, 
especially if you have rsgroups enabled with multiple tenants wanting to take 
backups at different intervals.

Following is the list of changes in the PR.
 # Remove the exclusive system-wide lock and replace it with table-level locks 
taken via checkAndPut, with repair in case of abruptly dead jobs. This helps in 
taking parallel table backups and configuring independent RPOs (see the sketch 
below).
 # Taking a snapshot of the backup table at the beginning of a backup and 
restoring that snapshot at the end was unnecessary. This is removed as it 
serves no purpose, which also simplifies the logic.

These are the 2 changes. Because multiple backups can now be in progress at any 
point in time, I had to change the single BackupId to a List while handling 
sessions. These are the changes in this PR.

[~zhangduo] I have listed the changes above. If you want any other information, 
please ask.

P.S: Thank you very much for taking the time to look into this.
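
For illustration, here is a minimal sketch of change 1: a table-level lock 
taken atomically with checkAndPut semantics on the backup meta table. This is 
an assumption-laden sketch, not the actual patch; the lock row layout, the 
column names, and the TableBackupLock class are invented for the example.

{code:java}
// Hypothetical sketch: one lock row per table in the backup:system meta table.
// Row and column names are illustrative, not the real schema of this PR.
import java.io.IOException;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class TableBackupLock {
  private static final byte[] META = Bytes.toBytes("meta");
  private static final byte[] LOCK = Bytes.toBytes("lock");

  /** Returns true only if no other backup session holds this table's lock. */
  public static boolean tryLock(Connection conn, String table, String backupId)
      throws IOException {
    byte[] row = Bytes.toBytes("lock:" + table);
    Put put = new Put(row).addColumn(META, LOCK, Bytes.toBytes(backupId));
    try (Table meta = conn.getTable(TableName.valueOf("backup:system"))) {
      // Atomic check-and-put: succeeds only if the lock cell is absent, so two
      // parallel backups of the same table can never both acquire the lock.
      return meta.checkAndMutate(row, META)
          .qualifier(LOCK)
          .ifNotExists()
          .thenPut(put);
    }
  }
}
{code}

Unlike the old system-wide lock, two tenants backing up different tables 
contend on different rows, so their backups can run in parallel.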


was (Author: rda3mon):
One can take only a single backup at a time, which takes an exclusive 
system-wide lock, resulting in the following problems, especially if you have 
rsgroups enabled with multiple tenants wanting to take backups at different 
intervals.

Following is the list of changes in the PR.
 # Remove the exclusive system-wide lock and replace it with table-level locks 
taken via checkAndPut, with repair in case of abruptly dead jobs. This helps in 
taking parallel backups and configuring independent RPOs.
 # Taking a snapshot of the backup table at the beginning of a backup and 
restoring that snapshot at the end was unnecessary. This is removed as it 
serves no purpose.

These are the 2 changes. Because multiple backups can now be in progress at any 
point in time, I had to change the single BackupId to a List while handling 
sessions. These are the changes in this PR.

[~zhangduo] I have listed the changes above. If you want any other information, 
please ask.

P.S: Thank you very much for taking the time to look into this.

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-4
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-26034) Add support to take parallel backups

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17555619#comment-17555619
 ] 

Mallikarjun commented on HBASE-26034:
-

One can take only a single backup at a time, which takes an exclusive 
system-wide lock, resulting in the following problems, especially if you have 
rsgroups enabled with multiple tenants wanting to take backups at different 
intervals.

Following is the list of changes in the PR.
 # Remove the exclusive system-wide lock and replace it with table-level locks 
taken via checkAndPut, with repair in case of abruptly dead jobs. This helps in 
taking parallel backups and configuring independent RPOs.
 # Taking a snapshot of the backup table at the beginning of a backup and 
restoring that snapshot at the end was unnecessary. This is removed as it 
serves no purpose.

These are the 2 changes. Because multiple backups can now be in progress at any 
point in time, I had to change the single BackupId to a List while handling 
sessions. These are the changes in this PR.

[~zhangduo] I have listed the changes above. If you want any other information, 
please ask.

P.S: Thank you very much for taking the time to look into this.

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-4
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1795#comment-1795
 ] 

Mallikarjun edited comment on HBASE-26322 at 6/17/22 1:26 PM:
--

Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (also, WALs are not compressed, 
which fills the disk even faster).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.
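
As a rough illustration of the fix for both problems, the set of regionservers 
that should participate in a backup (and whose WALs the cleaner must retain) 
can be derived from the rsgroups of the backed-up tables. A minimal sketch 
under assumed names follows; RsGroupWalFilter and the lookup interface are 
invented for the example, not the actual HBASE-26322 patch.

{code:java}
// Hypothetical sketch: derive the regionservers participating in backup from
// the rsgroups hosting the backed-up tables. Names are illustrative only.
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import org.apache.hadoop.hbase.ServerName;
import org.apache.hadoop.hbase.TableName;

public class RsGroupWalFilter {

  /** Stand-in for an rsgroup admin lookup; assumed for this example. */
  public interface RsGroupLookup {
    Set<ServerName> serversOfGroupHosting(TableName table);
  }

  /** In the example above, backing up only tableA yields {rs1A, rs2A, rs3A}. */
  public static Set<ServerName> backupServers(List<TableName> backedUpTables,
      RsGroupLookup lookup) {
    Set<ServerName> servers = new HashSet<>();
    for (TableName table : backedUpTables) {
      // Only the rsgroup hosting the table participates in the backup, so only
      // its servers' WALs are backed up and later cleaned by BackupLogCleaner.
      servers.addAll(lookup.serversOfGroupHosting(table));
    }
    return servers;
  }
}
{code}

WALs from servers outside this set never enter a backup, so the cleaner can 
delete them on its normal schedule instead of retaining them forever.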


was (Author: rda3mon):
Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (also, WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1795#comment-1795
 ] 

Mallikarjun edited comment on HBASE-26322 at 6/17/22 1:26 PM:
--

Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (also, WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.


was (Author: rda3mon):
Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1795#comment-1795
 ] 

Mallikarjun edited comment on HBASE-26322 at 6/17/22 1:26 PM:
--

Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.


was (Author: rda3mon):
Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1795#comment-1795
 ] 

Mallikarjun edited comment on HBASE-26322 at 6/17/22 1:25 PM:
--

Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems. (WALs are retained until the next 
successful backup completes.)

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.


was (Author: rda3mon):
Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems.

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1795#comment-1795
 ] 

Mallikarjun edited comment on HBASE-26322 at 6/17/22 1:24 PM:
--

Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. Regionservers rs1A, rs2A, rs3A are part of 
RsgroupA, and regionservers rs1B, rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems.

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.


was (Author: rda3mon):
Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. rs1A, rs2A, rs3A are part of RsgroupA, and rs1B, 
rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems.

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Commented] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1795#comment-1795
 ] 

Mallikarjun commented on HBASE-26322:
-

Backup currently doesn't understand rsgroups, which results in 2 problems.

Say there are 2 rsgroups, RsgroupA and RsgroupB. tableA is part of RsgroupA and 
tableB is part of RsgroupB. rs1A, rs2A, rs3A are part of RsgroupA, and rs1B, 
rs2B, rs3B are part of RsgroupB.

Problem 1:

When you enable backup on tableA, only rs1A, rs2A, rs3A should participate in 
the backup (the WALs of these regionservers are backed up). Since backup 
doesn't understand rsgroups, all regionservers participate in the backup: rs1A, 
rs2A, rs3A, rs1B, rs2B, rs3B. This means you need to plan additional capacity 
for the extra WALs, among other problems.

Problem 2:
BackupLogCleaner also doesn't understand rsgroups when incremental backup is 
enabled. This can result in a big problem. In the above example, say backup is 
configured for only tableA. BackupLogCleaner therefore cleans up the WALs of 
only rs1A, rs2A, rs3A once a backup completes. The WALs of rs1B, rs2B, rs3B are 
never cleaned up because no table backup is configured for them, and the 
ever-growing WALs will easily fill up the disk (since WALs are not compressed).

[~zhangduo] Hope this is enough detail. Please ask about anything you did not 
understand.

P.S: Thank you very much for taking the time to look into this.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Updated] (HBASE-26322) Add rsgroup support for Backup

2022-06-17 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26322:

Affects Version/s: 3.0.0-alpha-2
   (was: 3.0.0-alpha-1)

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-4
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Comment Edited] (HBASE-26322) Add rsgroup support for Backup

2021-12-02 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452402#comment-17452402
 ] 

Mallikarjun edited comment on HBASE-26322 at 12/2/21, 1:16 PM:
---

[~zhangduo] [~stack] [~anoop.hbase] Kindly help me with this review when you 
can find some free time. Thanks


was (Author: rda3mon):
[~zhangduo] [~stack] [~anoop.hbase] Kindly help me with this review when you 
can find some free time.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-2
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-26034) Add support to take parallel backups

2021-12-02 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452404#comment-17452404
 ] 

Mallikarjun commented on HBASE-26034:
-

[~zhangduo] [~stack] [~anoop.hbase] Kindly help me with this review when you 
can find some free time.

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (HBASE-26034) Add support to take parallel backups

2021-12-02 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452404#comment-17452404
 ] 

Mallikarjun edited comment on HBASE-26034 at 12/2/21, 1:16 PM:
---

[~zhangduo] [~stack] [~anoop.hbase] Kindly help me with this review when you 
can find some free time. Thanks


was (Author: rda3mon):
[~zhangduo] [~stack] [~anoop.hbase] Kindly help me with this review when you 
can find some free time.

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Commented] (HBASE-26322) Add rsgroup support for Backup

2021-12-02 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17452402#comment-17452402
 ] 

Mallikarjun commented on HBASE-26322:
-

[~zhangduo] [~stack] [~anoop.hbase] Kindly help me with this review when you 
can find some free time.

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-2
>
>
> There are some places where backup needs some changes with respect to 
> rsgroup. Some of them being addressed here are 
>  # Incremental backup wal backup should happen only for servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration
>  # BackupLogCleaner should keep references only from those servers which are 
> part of a particular rsgroup under which namespace is configured for table 
> backup under consideration



--
This message was sent by Atlassian Jira
(v8.20.1#820001)


[jira] [Comment Edited] (HBASE-26034) Add support to take parallel backups

2021-10-19 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430482#comment-17430482
 ] 

Mallikarjun edited comment on HBASE-26034 at 10/19/21, 11:31 AM:
-

[~zhangduo] [~stack] [~anoop.hbase] The patch for this is ready for review. 
Could someone please have a look when you have some time?


was (Author: rda3mon):
[~zhangduo] [~stack] [~anoop.hbase] The patch for this is ready for review. 
Could someone please have a look when you have some time?

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26034) Add support to take parallel backups

2021-10-19 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17430482#comment-17430482
 ] 

Mallikarjun commented on HBASE-26034:
-

[~zhangduo] [~stack] [~anoop.hbase] The patch for this is ready for review. 
Could someone please have a look when you have some time?

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26343) Extend RSGroup to support data isolation to achieve true multitenancy in Hbase

2021-10-12 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26343:

Description: 
RSGroups currently only provide isolation at the serving layer, not at the data 
layer. There is a need to provide data isolation between rsgroups to achieve 
true multitenancy in HBase, so that individual rsgroups can be scaled 
independently as needed. Some of the aspects to be covered in this umbrella 
project are:
 # Provide data isolation between different RSGroups
 # Add balancer support to understand this construct while performing balancer 
activity
 # Extend support to various ancillary services such as export snapshot, 
cluster replication, etc.

  was:
RSGroups currently only provide isolation at the serving layer, not at the data 
layer. There is a need to provide data isolation between rsgroups to achieve 
true multitenancy in HBase, so that individual rsgroups can be scaled 
independently as needed. Some of the aspects to be covered in this umbrella 
project are:
 # Provide data isolation between different RSGroups
 # Add balancer support to understand this construct to perform various 
balancing activities
 # Extend support to various ancillary services such as export snapshot, 
cluster replication, etc.


> Extend RSGroup to support data isolation to achieve true multitenancy in Hbase
> --
>
> Key: HBASE-26343
> URL: https://issues.apache.org/jira/browse/HBASE-26343
> Project: HBase
>  Issue Type: Umbrella
>  Components: rsgroup
>Reporter: Mallikarjun
>Priority: Major
>
> RSGroups currently only provide isolation at the serving layer, not at the 
> data layer. There is a need to provide data isolation between rsgroups to 
> achieve true multitenancy in HBase, so that individual rsgroups can be scaled 
> independently as needed. Some of the aspects to be covered in this umbrella 
> project are:
>  # Provide data isolation between different RSGroups
>  # Add balancer support to understand this construct while performing 
> balancer activity
>  # Extend support to various ancillary services such as export snapshot, 
> cluster replication, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26343) Extend RSGroup to support data isolation to achieve true multitenancy in Hbase

2021-10-12 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26343:

Description: 
RSGroups currently only provide isolation at the serving layer, not at the data 
layer. There is a need to provide data isolation between rsgroups to achieve 
true multitenancy in HBase, so that individual rsgroups can be scaled 
independently as needed. Some of the aspects to be covered in this umbrella 
project are:
 # Provide data isolation between different RSGroups
 # Add balancer support to understand this construct to perform various 
balancing activities
 # Extend support to various ancillary services such as export snapshot, 
cluster replication, etc.

  was:
RSGroups currently only provide isolation at the serving layer, not at the data 
layer. There is a need to provide data isolation between rsgroups to achieve 
true multitenancy in HBase, so that individual rsgroups can be scaled 
independently as needed. Some of the aspects to be covered in this umbrella 
project are:
 # Provide data isolation between different RSGroups
 # Add balancer support to understand this construct in various balancing 
activities
 # Extend support to various ancillary services such as export snapshot, 
cluster replication, etc.


> Extend RSGroup to support data isolation to achieve true multitenancy in Hbase
> --
>
> Key: HBASE-26343
> URL: https://issues.apache.org/jira/browse/HBASE-26343
> Project: HBase
>  Issue Type: Umbrella
>  Components: rsgroup
>Reporter: Mallikarjun
>Priority: Major
>
> RSGroups currently only provide isolation at the serving layer, not at the 
> data layer. There is a need to provide data isolation between rsgroups to 
> achieve true multitenancy in HBase, so that individual rsgroups can be scaled 
> independently as needed. Some of the aspects to be covered in this umbrella 
> project are:
>  # Provide data isolation between different RSGroups
>  # Add balancer support to understand this construct to perform various 
> balancing activities
>  # Extend support to various ancillary services such as export snapshot, 
> cluster replication, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26346) Design support for rsgroup data isolation

2021-10-11 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26346:

Description: Put down design for changes required to support rsgroup data 
isolation.    (was: TODO)

> Design support for rsgroup data isolation 
> --
>
> Key: HBASE-26346
> URL: https://issues.apache.org/jira/browse/HBASE-26346
> Project: HBase
>  Issue Type: New Feature
>  Components: rsgroup
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Put down design for changes required to support rsgroup data isolation.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HBASE-26346) Design support for rsgroup data isolation

2021-10-11 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-26346 started by Mallikarjun.
---
> Design support for rsgroup data isolation 
> --
>
> Key: HBASE-26346
> URL: https://issues.apache.org/jira/browse/HBASE-26346
> Project: HBase
>  Issue Type: New Feature
>  Components: rsgroup
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Put down design for changes required to support rsgroup data isolation.  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26346) Design support for rsgroup data isolation

2021-10-11 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26346:
---

 Summary: Design support for rsgroup data isolation 
 Key: HBASE-26346
 URL: https://issues.apache.org/jira/browse/HBASE-26346
 Project: HBase
  Issue Type: New Feature
  Components: rsgroup
Reporter: Mallikarjun
Assignee: Mallikarjun


TODO



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26343) Extend RSGroup to support data isolation to achieve true multitenancy in Hbase

2021-10-09 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26343:
---

 Summary: Extend RSGroup to support data isolation to achieve true 
multitenancy in Hbase
 Key: HBASE-26343
 URL: https://issues.apache.org/jira/browse/HBASE-26343
 Project: HBase
  Issue Type: Umbrella
  Components: rsgroup
Reporter: Mallikarjun


RSGroups currently only provide isolation at the serving layer, not at the data 
layer. There is a need to provide data isolation between rsgroups to achieve 
true multitenancy in HBase, so that individual rsgroups can be scaled 
independently as needed. Some of the aspects to be covered in this umbrella 
project are:
 # Provide data isolation between different RSGroups
 # Add balancer support to understand this construct in various balancing 
activities
 # Extend support to various ancillary services such as export snapshot, 
cluster replication, etc.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-10-09 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Release Note: * Remove dependence on storing WAL filenames for backup in the 
backup:system meta table

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently, WAL log references are stored in the `backup:system` meta table: 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup, and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format: 
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are 2 cases in which the WAL log references stored in `backup:system` 
> are used. 
> *Use Case 1.*
> *Existing Design:* Clean up WALs for which a backup has already been taken, 
> using `BackupLogCleaner`, which uses these references to clean up backed-up 
> logs.
> *New Design:*
> Since the log roll timestamp is stored per regionserver as part of each 
> backup, we can check all previous successful backups and then identify which 
> logs are to be retained and which are to be cleaned up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log roll timestamp recorded 
> per regionserver per table. 
>  * All WALs older than the oldest log roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> *Use Case 2.* 
> *Existing Design:* During incremental backup, the system table is checked for 
> duplicate WALs whose backup would otherwise be taken again. 
> *New Design:*
>  * Incremental backup already identifies which WALs are to be backed up using 
> the `rslogts:` rows mentioned above.
>  * Additionally it checks the `wals:` rows to ensure no logs are backed up a 
> second time. This is redundant, with no observed extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26322) Add rsgroup support for Backup

2021-10-06 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17424838#comment-17424838
 ] 

Mallikarjun commented on HBASE-26322:
-

[~zhangduo] [~stack] [~anoop.hbase] Patch for this is ready for review. Please 
someone have a look when you have some time. 

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-2
>
>
> There are some places where backup needs changes with respect to rsgroups. 
> The ones being addressed here are (see the sketch below):
>  # Incremental WAL backup should happen only for servers that are part of 
> the rsgroup under which the namespace of the table being backed up is 
> configured
>  # BackupLogCleaner should keep WAL references only from those servers that 
> are part of the rsgroup under which the namespace of the table being backed 
> up is configured
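>  
> A rough sketch of the server filtering this implies, assuming the rsgroup 
> membership lookups are available (names are illustrative, not the actual 
> rsgroup API): 
> {code:java}
> import java.util.Collections;
> import java.util.List;
> import java.util.Map;
> import java.util.Set;
> import java.util.stream.Collectors;
> 
> public class RsGroupFilterSketch {
>   // Restrict WAL collection to servers in the rsgroup hosting the namespace.
>   public static List<String> serversForNamespace(String namespace,
>       Map<String, String> rsGroupOfNamespace,    // namespace -> rsgroup name
>       Map<String, Set<String>> serversOfRsGroup, // rsgroup name -> servers
>       List<String> allLiveServers) {
>     String group = rsGroupOfNamespace.get(namespace);
>     if (group == null) {
>       return allLiveServers; // no rsgroup configured: fall back to all servers
>     }
>     Set<String> members =
>         serversOfRsGroup.getOrDefault(group, Collections.emptySet());
>     return allLiveServers.stream().filter(members::contains)
>         .collect(Collectors.toList());
>   }
> }
> {code}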



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HBASE-26322) Add rsgroup support for Backup

2021-10-02 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-26322 started by Mallikarjun.
---
> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-2
>
>
> There are some places where backup needs changes with respect to rsgroups. 
> The ones being addressed here are 
>  # Incremental WAL backup should happen only for servers that are part of 
> the rsgroup under which the namespace of the table being backed up is 
> configured
>  # BackupLogCleaner should keep WAL references only from those servers that 
> are part of the rsgroup under which the namespace of the table being backed 
> up is configured



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26322) Add rsgroup support for Backup

2021-10-02 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26322:

Description: 
There are some places where backup needs changes with respect to rsgroups. 
The ones being addressed here are 
 # Incremental WAL backup should happen only for servers that are part of the 
rsgroup under which the namespace of the table being backed up is configured
 # BackupLogCleaner should keep WAL references only from those servers that 
are part of the rsgroup under which the namespace of the table being backed 
up is configured

> Add rsgroup support for Backup
> --
>
> Key: HBASE-26322
> URL: https://issues.apache.org/jira/browse/HBASE-26322
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Minor
> Fix For: 3.0.0-alpha-2
>
>
> There are some places where backup needs changes with respect to rsgroups. 
> The ones being addressed here are 
>  # Incremental WAL backup should happen only for servers that are part of 
> the rsgroup under which the namespace of the table being backed up is 
> configured
>  # BackupLogCleaner should keep WAL references only from those servers that 
> are part of the rsgroup under which the namespace of the table being backed 
> up is configured



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26322) Add rsgroup support for Backup

2021-10-02 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26322:
---

 Summary: Add rsgroup support for Backup
 Key: HBASE-26322
 URL: https://issues.apache.org/jira/browse/HBASE-26322
 Project: HBase
  Issue Type: Improvement
  Components: backuprestore
Affects Versions: 3.0.0-alpha-1
Reporter: Mallikarjun
Assignee: Mallikarjun
 Fix For: 3.0.0-alpha-2






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-26301) Backport backup/restore to branch-2

2021-09-29 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422497#comment-17422497
 ] 

Mallikarjun edited comment on HBASE-26301 at 9/30/21, 2:10 AM:
---

[~bbeaudreault] I have a couple of features in the pipeline for the master 
branch. Post that, I will backport to 2.x. Assigned this ticket to myself. 


was (Author: rda3mon):
[~bbeaudreault] I have a couple of features in the pipeline for the master 
branch. Post that, I will backport to 2.x. Assigned this ticket to me. 

> Backport backup/restore to branch-2
> ---
>
> Key: HBASE-26301
> URL: https://issues.apache.org/jira/browse/HBASE-26301
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Mallikarjun
>Priority: Major
>
> I was discussing this great feature with [~rda3mon] on Slack. His company is 
> using this on their fork of hbase 2.1. We're working on upgrading to 2.4 now, 
> and have our own home grown backup/restore system which is not as 
> sophisticated as the native solution. If this solution was backported to 
> branch-2, we would strongly consider adopting it as we finish up our upgrade.
> It looks like this was originally cut from 2.0 due to release timeline 
> pressures: https://issues.apache.org/jira/browse/HBASE-19407, and now suffers 
> from a lack of community support. This might make sense since it only exists 
> in 3.x, which is not yet released.
> It would be great to backport this to branch-2 so that it can reach a wider 
> audience and gain adoption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-26301) Backport backup/restore to branch-2

2021-09-29 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422497#comment-17422497
 ] 

Mallikarjun edited comment on HBASE-26301 at 9/30/21, 2:10 AM:
---

[~bbeaudreault] I have a couple of features in the pipeline for the master 
branch. Post that, I will backport to 2.x. Assigned this ticket to me. 


was (Author: rda3mon):
[~bbeaudreault] I have a couple of features in the pipeline for the master 
branch. Post that, I will backport to 2.x. 

> Backport backup/restore to branch-2
> ---
>
> Key: HBASE-26301
> URL: https://issues.apache.org/jira/browse/HBASE-26301
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Mallikarjun
>Priority: Major
>
> I was discussing this great feature with [~rda3mon] on Slack. His company is 
> using this on their fork of hbase 2.1. We're working on upgrading to 2.4 now, 
> and have our own home grown backup/restore system which is not as 
> sophisticated as the native solution. If this solution was backported to 
> branch-2, we would strongly consider adopting it as we finish up our upgrade.
> It looks like this was originally cut from 2.0 due to release timeline 
> pressures: https://issues.apache.org/jira/browse/HBASE-19407, and now suffers 
> from a lack of community support. This might make sense since it only exists 
> in 3.x, which is not yet released.
> It would be great to backport this to branch-2 so that it can reach a wider 
> audience and gain adoption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26301) Backport backup/restore to branch-2

2021-09-29 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17422497#comment-17422497
 ] 

Mallikarjun commented on HBASE-26301:
-

[~bbeaudreault] I have a couple of features in the pipeline for the master 
branch. Post that, I will backport to 2.x. 

> Backport backup/restore to branch-2
> ---
>
> Key: HBASE-26301
> URL: https://issues.apache.org/jira/browse/HBASE-26301
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Mallikarjun
>Priority: Major
>
> I was discussing this great feature with [~rda3mon] on Slack. His company is 
> using this on their fork of hbase 2.1. We're working on upgrading to 2.4 now, 
> and have our own home grown backup/restore system which is not as 
> sophisticated as the native solution. If this solution was backported to 
> branch-2, we would strongly consider adopting it as we finish up our upgrade.
> It looks like this was originally cut from 2.0 due to release timeline 
> pressures: https://issues.apache.org/jira/browse/HBASE-19407, and now suffers 
> from a lack of community support. This might make sense since it only exists 
> in 3.x, which is not yet released.
> It would be great to backport this to branch-2 so that it can reach a wider 
> audience and gain adoption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (HBASE-26301) Backport backup/restore to branch-2

2021-09-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun reassigned HBASE-26301:
---

Assignee: Mallikarjun

> Backport backup/restore to branch-2
> ---
>
> Key: HBASE-26301
> URL: https://issues.apache.org/jira/browse/HBASE-26301
> Project: HBase
>  Issue Type: New Feature
>Reporter: Bryan Beaudreault
>Assignee: Mallikarjun
>Priority: Major
>
> I was discussing this great feature with [~rda3mon] on Slack. His company is 
> using this on their fork of hbase 2.1. We're working on upgrading to 2.4 now, 
> and have our own home grown backup/restore system which is not as 
> sophisticated as the native solution. If this solution was backported to 
> branch-2, we would strongly consider adopting it as we finish up our upgrade.
> It looks like this was originally cut from 2.0 due to release timeline 
> pressures: https://issues.apache.org/jira/browse/HBASE-19407, and now suffers 
> from a lack of community support. This might make sense since it only exists 
> in 3.x, which is not yet released.
> It would be great to backport this to branch-2 so that it can reach a wider 
> audience and gain adoption.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-09-13 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413942#comment-17413942
 ] 

Mallikarjun commented on HBASE-25891:
-

I was under the impression it was somewhere else and not Jira :) 

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently, WAL filenames are stored in the `backup:system` meta table: 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format:  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are 2 cases in which the WAL references stored in `backup:system` are 
> used. 
> *Use Case 1.*
> *Existing Design:* To clean up WALs for which a backup has already been 
> taken, using `BackupLogCleaner`, which uses these references to clean up 
> backed-up logs.
> *New Design:*
> Since the log roll timestamp is stored per regionserver as part of each 
> backup, we can check all previous successful backups and then identify which 
> logs are to be retained and which ones are to be cleaned up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log roll timestamp recorded 
> per regionserver per table. 
>  * All WALs older than the oldest log roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> *Use Case 2.* 
> *Existing Design:* During incremental backup, check the system table for 
> duplicate WALs that have already been backed up. 
> *New Design:*
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally it checks `wals:` to ensure no log is backed up a second 
> time; this is redundant and shows no extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-09-12 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413844#comment-17413844
 ] 

Mallikarjun edited comment on HBASE-25891 at 9/13/21, 1:57 AM:
---

Thanks [~zhangduo] [~stack] for reviewing the code.  Thanks [~anoop.hbase] for 
assisting on these changes. 

[~zhangduo] Where do I fill in the release notes? Any pointers?


was (Author: rda3mon):
Thanks [~zhangduo] [~stack] for reviewing the code.  Thanks [~anoop.hbase] for 
assisting on this PR. 

[~zhangduo] Where do I fill in the release notes? Any pointers?

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently, WAL filenames are stored in the `backup:system` meta table: 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format:  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are 2 cases in which the WAL references stored in `backup:system` are 
> used. 
> *Use Case 1.*
> *Existing Design:* To clean up WALs for which a backup has already been 
> taken, using `BackupLogCleaner`, which uses these references to clean up 
> backed-up logs.
> *New Design:*
> Since the log roll timestamp is stored per regionserver as part of each 
> backup, we can check all previous successful backups and then identify which 
> logs are to be retained and which ones are to be cleaned up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log roll timestamp recorded 
> per regionserver per table. 
>  * All WALs older than the oldest log roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> *Use Case 2.* 
> *Existing Design:* During incremental backup, check the system table for 
> duplicate WALs that have already been backed up. 
> *New Design:*
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally it checks `wals:` to ensure no log is backed up a second 
> time; this is redundant and shows no extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-09-12 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17413844#comment-17413844
 ] 

Mallikarjun commented on HBASE-25891:
-

Thanks [~zhangduo] [~stack] for reviewing the code.  Thanks [~anoop.hbase] for 
assisting on this PR. 

[~zhangduo] Where do I fill in the release notes? Any pointers?

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently, WAL filenames are stored in the `backup:system` meta table: 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format:  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are 2 cases in which the WAL references stored in `backup:system` are 
> used. 
> *Use Case 1.*
> *Existing Design:* To clean up WALs for which a backup has already been 
> taken, using `BackupLogCleaner`, which uses these references to clean up 
> backed-up logs.
> *New Design:*
> Since the log roll timestamp is stored per regionserver as part of each 
> backup, we can check all previous successful backups and then identify which 
> logs are to be retained and which ones are to be cleaned up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log roll timestamp recorded 
> per regionserver per table. 
>  * All WALs older than the oldest log roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> *Use Case 2.* 
> *Existing Design:* During incremental backup, check the system table for 
> duplicate WALs that have already been backed up. 
> *New Design:*
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally it checks `wals:` to ensure no log is backed up a second 
> time; this is redundant and shows no extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26279) Merger of backup:system table with hbase:meta table

2021-09-12 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26279:
---

 Summary: Merger of backup:system table with hbase:meta table
 Key: HBASE-26279
 URL: https://issues.apache.org/jira/browse/HBASE-26279
 Project: HBase
  Issue Type: Improvement
  Components: backuprestore
Reporter: Mallikarjun
Assignee: Mallikarjun


To Be filled



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take parallel backups

2021-09-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Description: 
*Existing Design:*

!existing_design.png|width=632,height=1238!

*Proposed Changes:*

*!proposed_design.png|width=637,height=1300!*

  was:
*Existing Design:*

!existing_design.png|width=637,height=1248!

Changes:

!proposed_design.png|width=626,height=1277!


> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=632,height=1238!
> *Proposed Changes:*
> *!proposed_design.png|width=637,height=1300!*



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take parallel backups

2021-09-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Attachment: proposed_design.png

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=637,height=1248!
> Changes:
> !proposed_design.png|width=626,height=1277!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take parallel backups

2021-09-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Attachment: existing_design.png

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
> Attachments: existing_design.png
>
>
> *Existing Design:*
> !existing_design.png|width=637,height=1248!
> Changes:
> !proposed_design.png|width=626,height=1277!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HBASE-26034) Add support to take parallel backups

2021-09-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-26034 started by Mallikarjun.
---
> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> *Existing Design:*
> !existing_design.png|width=637,height=1248!
> Changes:
> !proposed_design.png|width=626,height=1277!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take parallel backups

2021-09-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Description: 
*Existing Design:*

!existing_design.png|width=637,height=1248!

Changes:

!proposed_design.png|width=626,height=1277!

  was:Details to be filled.


> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> *Existing Design:*
> !existing_design.png|width=637,height=1248!
> Changes:
> !proposed_design.png|width=626,height=1277!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Description: 
*Existing Design*

  !existing_design.png|width=851,height=1667!

*Problem 1:* 
 With this design, incremental and full backups can't be run in parallel, 
leading to degraded RPOs when a full backup runs for a long time, especially 
for large tables.
  
 Example: 
 Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you are allowed to ship the remote backup at 800 Mbps. You are also allowed to 
take full backups once a week, and the rest should be incremental backups.
  
 Shortcoming: With the above design, one can't run parallel backups; whenever 
a full backup is running (which takes roughly 25 hours), you are not allowed 
to take incremental backups, and that would be a breach of your RPO. 
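  
 As a back-of-the-envelope check on the 25-hour figure (pure transfer time, 
ignoring snapshot and export overhead):
{code:java}
public class TransferTime {
  public static void main(String[] args) {
    long bytes = 10L * 1000 * 1000 * 1000 * 1000; // 10 TB (decimal)
    double seconds = bytes * 8 / 800e6;           // = 100,000 s at 800 Mbps
    System.out.println(seconds / 3600 + " hours"); // ~27.8 hours
  }
}
{code}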
  
 *Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups can then run on top of older successful full / incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering (see the sketch below). I have not worked on the full 
redesign, and will do so if this proposal seems acceptable to the community.
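  
 A minimal sketch of that ordering rule (illustrative field names, not the 
actual BackupInfo API):
{code:java}
import java.util.Comparator;

public class BackupOrderSketch {
  static class BackupMeta {
    String backupId;
    long startTs;    // when the backup started
    long completeTs; // when the backup finished successfully
  }

  // With parallel backups, a full backup that started first may finish last,
  // so ancestry must be ordered by completion time rather than start time.
  static final Comparator<BackupMeta> ANCESTRY_ORDER =
      Comparator.comparingLong(m -> m.completeTs);
}
{code}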
  
 *Problem 2:*
 With one backup at a time, the system fails easily for multi-tenant use. 
This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration of 
the backup
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as previous proposal
  
 *Problem 3:* 
 Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full / incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded if there are transient problems, such as the backup 
site facing issues or anything else, until the next scheduled backup succeeds.
 *Proposed Solution:* I can't think of anything better, but I see this can be 
a potential problem. Also, one can force a full backup if required WAL files 
are missing for whatever other reasons not necessarily mentioned above. 
  

*Proposed Design.*

!proposed_design.png|width=865,height=1766!

  was:
*Existing Design*

 

*Problem 1:* 
 With this design, incremental and full backups can't be run in parallel, 
leading to degraded RPOs when a full backup runs for a long time, especially 
for large tables.
  
 Example: 
 Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you are allowed to ship the remote backup at 800 Mbps. You are also allowed to 
take full backups once a week, and the rest should be incremental backups.
  
 Shortcoming: With the above design, one can't run parallel backups; whenever 
a full backup is running (which takes roughly 25 hours), you are not allowed 
to take incremental backups, and that would be a breach of your RPO. 
  
 *Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups can then run on top of older successful full / incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering. I have not worked on the full redesign, and will do 
so if this proposal seems acceptable to the community.
  
 *Problem 2:*
 With one backup at a time, the system fails easily for multi-tenant use. 
This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration of 
the backup
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as previous proposal
  
 *Problem 3:* 
 Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full / incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded if there are transient problems, such as the backup 
site facing issues or anything else, until the next scheduled backup succeeds.
 *Proposed Solution:* I can't think of anything better, but I see this can be 
a potential problem. Also, one can force a full backup if required WAL files 
are missing for whatever other reasons not necessarily mentioned above. 
  

*Proposed Design.*


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: proposed_design.png

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design*
>  
> *Problem 1:* 
>  With this design, incremental and full backups can't be run in parallel, 
> leading to degraded RPOs when a full backup runs for a long time, especially 
> for large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you are allowed to ship the remote backup at 800 Mbps. You are also allowed 
> to take full backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups; 
> whenever a full backup is running (which takes roughly 25 hours), you are 
> not allowed to take incremental backups, and that would be a breach of your 
> RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups can then run on top of older successful full / 
> incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the system fails easily for multi-tenant use. 
> This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as previous proposal
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full / incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded if there are transient problems, such as the 
> backup site facing issues or anything else, until the next scheduled backup 
> succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons not necessarily mentioned 
> above. 
>   
> *Proposed Design.*
> !image-2021-06-03-16-34-34-957.png|width=324,height=416!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: existing_design.png

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
> Attachments: existing_design.png, proposed_design.png
>
>
> *Existing Design*
>  
> *Problem 1:* 
>  With this design, incremental and full backups can't be run in parallel, 
> leading to degraded RPOs when a full backup runs for a long time, especially 
> for large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you are allowed to ship the remote backup at 800 Mbps. You are also allowed 
> to take full backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups; 
> whenever a full backup is running (which takes roughly 25 hours), you are 
> not allowed to take incremental backups, and that would be a breach of your 
> RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups can then run on top of older successful full / 
> incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the system fails easily for multi-tenant use. 
> This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as previous proposal
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full / incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded if there are transient problems, such as the 
> backup site facing issues or anything else, until the next scheduled backup 
> succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons not necessarily mentioned 
> above. 
>   
> *Proposed Design.*
> !image-2021-06-03-16-34-34-957.png|width=324,height=416!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: (was: image-2021-06-03-16-34-34-957.png)

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
>
> *Existing Design*
>  
> *Problem 1:* 
>  With this design, incremental and full backups can't be run in parallel, 
> leading to degraded RPOs when a full backup runs for a long time, especially 
> for large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you are allowed to ship the remote backup at 800 Mbps. You are also allowed 
> to take full backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups; 
> whenever a full backup is running (which takes roughly 25 hours), you are 
> not allowed to take incremental backups, and that would be a breach of your 
> RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups can then run on top of older successful full / 
> incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the system fails easily for multi-tenant use. 
> This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as previous proposal
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full / incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded if there are transient problems, such as the 
> backup site facing issues or anything else, until the next scheduled backup 
> succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons not necessarily mentioned 
> above. 
>   
> *Proposed Design.*
> !image-2021-06-03-16-34-34-957.png|width=324,height=416!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: (was: image-2021-06-03-16-33-59-282.png)

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
>
> *Existing Design*
>  
> *Problem 1:* 
>  With this design, incremental and full backups can't be run in parallel, 
> leading to degraded RPOs when a full backup runs for a long time, especially 
> for large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you are allowed to ship the remote backup at 800 Mbps. You are also allowed 
> to take full backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups; 
> whenever a full backup is running (which takes roughly 25 hours), you are 
> not allowed to take incremental backups, and that would be a breach of your 
> RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups can then run on top of older successful full / 
> incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the system fails easily for multi-tenant use. 
> This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as previous proposal
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full / incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded if there are transient problems, such as the 
> backup site facing issues or anything else, until the next scheduled backup 
> succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons not necessarily mentioned 
> above. 
>   
> *Proposed Design.*
> !image-2021-06-03-16-34-34-957.png|width=324,height=416!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Description: 
*Existing Design*

 

*Problem 1:* 
 With this design, incremental and full backups can't be run in parallel, 
leading to degraded RPOs when a full backup runs for a long time, especially 
for large tables.
  
 Example: 
 Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you are allowed to ship the remote backup at 800 Mbps. You are also allowed to 
take full backups once a week, and the rest should be incremental backups.
  
 Shortcoming: With the above design, one can't run parallel backups; whenever 
a full backup is running (which takes roughly 25 hours), you are not allowed 
to take incremental backups, and that would be a breach of your RPO. 
  
 *Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups can then run on top of older successful full / incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering. I have not worked on the full redesign, and will do 
so if this proposal seems acceptable to the community.
  
 *Problem 2:*
 With one backup at a time, the system fails easily for multi-tenant use. 
This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration of 
the backup
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as previous proposal
  
 *Problem 3:* 
 Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full / incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded if there are transient problems, such as the backup 
site facing issues or anything else, until the next scheduled backup succeeds.
 *Proposed Solution:* I can't think of anything better, but I see this can be 
a potential problem. Also, one can force a full backup if required WAL files 
are missing for whatever other reasons not necessarily mentioned above. 
  

*Proposed Design.*

!image-2021-06-03-16-34-34-957.png|width=324,height=416!

  was:
*Existing Design*

!Backup Flow Chart.png|width=825,height=1617!

*Problem 1:* 
 With this design, incremental and full backups can't be run in parallel, 
leading to degraded RPOs when a full backup runs for a long time, especially 
for large tables.
  
 Example: 
 Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you are allowed to ship the remote backup at 800 Mbps. You are also allowed to 
take full backups once a week, and the rest should be incremental backups.
  
 Shortcoming: With the above design, one can't run parallel backups; whenever 
a full backup is running (which takes roughly 25 hours), you are not allowed 
to take incremental backups, and that would be a breach of your RPO. 
  
 *Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups can then run on top of older successful full / incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering. I have not worked on the full redesign, and will do 
so if this proposal seems acceptable to the community.
  
 *Problem 2:*
 With one backup at a time, the system fails easily for multi-tenant use. 
This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration of 
the backup
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as previous proposal
  
 *Problem 3:* 
 Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full / incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded if there are transient problems, such as the backup 
site facing issues or anything else, until the next scheduled backup succeeds.
 *Proposed Solution:* I can't think of anything better, but I see this can be 
a potential problem. Also, one can force a full backup if required WAL files 
are missing for whatever other reasons not necessarily mentioned above. 
  

*Proposed Design.*


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: (was: Backup Flow Chart.png)

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
>
> *Existing Design*
>  
> *Problem 1:* 
>  With this design, incremental and full backups can't be run in parallel, 
> leading to degraded RPOs when a full backup runs for a long time, especially 
> for large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you are allowed to ship the remote backup at 800 Mbps. You are also allowed 
> to take full backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups; 
> whenever a full backup is running (which takes roughly 25 hours), you are 
> not allowed to take incremental backups, and that would be a breach of your 
> RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups can then run on top of older successful full / 
> incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the system fails easily for multi-tenant use. 
> This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as previous proposal
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full / incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded if there are transient problems, such as the 
> backup site facing issues or anything else, until the next scheduled backup 
> succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons not necessarily mentioned 
> above. 
>   
> *Proposed Design.*
> !image-2021-06-03-16-34-34-957.png|width=324,height=416!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: Backup Flow Chart.png

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
> Attachments: Backup Flow Chart.png, 
> image-2021-06-03-16-33-59-282.png, image-2021-06-03-16-34-34-957.png
>
>
> *Existing Design*
> *!image-2021-06-03-16-33-59-282.png|width=292,height=408!*
> *Problem 1:* 
>  With this design, incremental and full backups can't be run in parallel, 
> leading to degraded RPOs when a full backup runs for a long time, especially 
> for large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you are allowed to ship the remote backup at 800 Mbps. You are also allowed 
> to take full backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups; 
> whenever a full backup is running (which takes roughly 25 hours), you are 
> not allowed to take incremental backups, and that would be a breach of your 
> RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups can then run on top of older successful full / 
> incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the system fails easily for multi-tenant use. 
> This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as previous proposal
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full / incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded if there are transient problems, such as the 
> backup site facing issues or anything else, until the next scheduled backup 
> succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons not necessarily mentioned 
> above. 
>   
> *Proposed Design.*
> !image-2021-06-03-16-34-34-957.png|width=324,height=416!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-08-29 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Description: 
*Existing Design*

!Backup Flow Chart.png|width=825,height=1617!

*Problem 1:* 
 With this design, incremental and full backups can't run in parallel, 
leading to degraded RPOs when a full backup runs long, especially for 
large tables.
  
 Example: 
 Expectation: Say you have a big 10 TB table, your RPO is 60 minutes, and you 
can ship the remote backup at 800 Mbps. You are allowed to take full backups 
once a week, and the rest should be incremental backups.
  
 Shortcoming: With the above design, one can't run parallel backups, so 
whenever a full backup is running (which takes roughly 25 hours) you cannot 
take incremental backups, and that would be a breach of your RPO. 
  
 *Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups can then run based on older successful full / incremental 
backups, with the completion time of a backup used instead of its start time 
for ordering. I have not worked on the full redesign, and will do so if this 
proposal seems acceptable to the community.
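To make the ordering idea concrete, a minimal sketch (class and field names 
are hypothetical, not the actual backup API): the next incremental backup 
picks its parent by completion time, so a long-running full backup that 
started first but finished last is still ordered correctly.
{code:java}
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

class CompletedBackup {
  final String backupId;
  final long startTs;     // when the backup began (epoch millis)
  final long completeTs;  // when it finished; used for ordering
  CompletedBackup(String backupId, long startTs, long completeTs) {
    this.backupId = backupId;
    this.startTs = startTs;
    this.completeTs = completeTs;
  }
}

class BackupOrdering {
  // Parent of the next incremental backup: latest *completion* time,
  // not latest start time.
  static Optional<CompletedBackup> parentForIncremental(List<CompletedBackup> done) {
    return done.stream().max(Comparator.comparingLong(b -> b.completeTs));
  }
}
{code}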
  
 *Problem 2:*
 With one backup at a time, the system fails easily for multi-tenant use. This 
poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants in the system, as one tenant has no 
control over other tenants' table sizes and hence the duration of the 
backup
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as the previous proposal.
  
 *Problem 3:* 
 Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full / incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded when there are transient problems, such as the 
backup site having issues, until the next scheduled backup succeeds
 *Proposed Solution:* I can't think of anything better, but I see this can be a 
potential problem. Also, one can force a full backup if required WAL files are 
missing for whatever reason, not necessarily those mentioned above. 
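A sketch of that fallback (purely illustrative; `WalStore` and the method 
names are assumptions, not the actual backup API): before attempting an 
incremental backup, verify that every required WAL still exists, and force a 
full backup otherwise.
{code:java}
import java.util.List;

class IncrementalGuard {
  interface WalStore {
    boolean exists(String walName);
  }

  // Returns false when any WAL needed by the incremental backup is gone,
  // in which case the caller should force a full backup instead.
  static boolean canRunIncremental(List<String> requiredWals, WalStore store) {
    for (String wal : requiredWals) {
      if (!store.exists(wal)) {
        return false;
      }
    }
    return true;
  }
}
{code}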
  

*Proposed Design.*

!image-2021-06-03-16-34-34-957.png|width=324,height=416!

  was:
*Existing Design*

*!image-2021-06-03-16-33-59-282.png|width=292,height=408!*

*Problem 1:* 
 With this design, incremental and full backups can't run in parallel, 
leading to degraded RPOs when a full backup runs long, especially for 
large tables.
  
 Example: 
 Expectation: Say you have a big 10 TB table, your RPO is 60 minutes, and you 
can ship the remote backup at 800 Mbps. You are allowed to take full backups 
once a week, and the rest should be incremental backups.
  
 Shortcoming: With the above design, one can't run parallel backups, so 
whenever a full backup is running (which takes roughly 25 hours) you cannot 
take incremental backups, and that would be a breach of your RPO. 
  
 *Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups can then run based on older successful full / incremental 
backups, with the completion time of a backup used instead of its start time 
for ordering. I have not worked on the full redesign, and will do so if this 
proposal seems acceptable to the community.
  
 *Problem 2:*
 With one backup at a time, the system fails easily for multi-tenant use. This 
poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants in the system, as one tenant has no 
control over other tenants' table sizes and hence the duration of the 
backup
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as the previous proposal.
  
 *Problem 3:* 
 Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full / incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded when there are transient problems, such as the 
backup site having issues, until the next scheduled backup succeeds
 *Proposed Solution:* I can't think of anything better, but I see this can be a 
potential problem. Also, one can force a full backup if required WAL files are 
missing for whatever reason, not necessarily those mentioned above. 

[jira] [Updated] (HBASE-26203) Minor cleanups to reduce checkstyle warnings on backup code

2021-08-16 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26203:

Description: 
As suggested in this PR --> 
[https://github.com/apache/hbase/pull/3359#pullrequestreview-716511415]

Created this issue to clean up Backup classes to reduce checkstyle warnings.

  was:`WALProcedureStore` stands deprecated. Review its usage in Backup/Restore


> Minor cleanups to reduce checkstyle warnings on backup code
> ---
>
> Key: HBASE-26203
> URL: https://issues.apache.org/jira/browse/HBASE-26203
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Trivial
> Fix For: 3.0.0-alpha-2
>
>
> As suggested in this PR --> 
> [https://github.com/apache/hbase/pull/3359#pullrequestreview-716511415]
> Created this issue to clean up Backup classes to reduce checkstyle warnings.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26203) Minor cleanups to reduce checkstyle warnings on backup code

2021-08-16 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26203:
---

 Summary: Minor cleanups to reduce checkstyle warnings on backup 
code
 Key: HBASE-26203
 URL: https://issues.apache.org/jira/browse/HBASE-26203
 Project: HBase
  Issue Type: Improvement
  Components: backuprestore
Affects Versions: 3.0.0-alpha-2
Reporter: Mallikarjun
Assignee: Mallikarjun
 Fix For: 3.0.0-alpha-2


`WALProcedureStore` stands deprecated. Review its usage in Backup/Restore



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26202) Review deprecated WALProcedureStore usage in Backup

2021-08-16 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26202:

Description: `WALProcedureStore` stands deprecated. Review its usage in 
Backup/Restore

> Review deprecated WALProcedureStore usage in Backup
> ---
>
> Key: HBASE-26202
> URL: https://issues.apache.org/jira/browse/HBASE-26202
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Trivial
> Fix For: 3.0.0-alpha-2
>
>
> `WALProcedureStore` stands deprecated. Review its usage in Backup/Restore



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26202) Review deprecated WALProcedureStore usage in Backup

2021-08-16 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26202:
---

 Summary: Review deprecated WALProcedureStore usage in Backup
 Key: HBASE-26202
 URL: https://issues.apache.org/jira/browse/HBASE-26202
 Project: HBase
  Issue Type: Improvement
  Components: backuprestore
Affects Versions: 3.0.0-alpha-2
Reporter: Mallikarjun
Assignee: Mallikarjun
 Fix For: 3.0.0-alpha-2






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-26147) Add dry run mode to hbase balancer

2021-07-28 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389189#comment-17389189
 ] 

Mallikarjun edited comment on HBASE-26147 at 7/29/21, 2:41 AM:
---

[~bbeaudreault] This feature is going to be super useful. 

This comment is not directly about the feature as such.
 # The command name could be in line with the existing balancer command, maybe 
`balancer 'dry_run'` or something similar to `force` (see the sketch below). 
 # Rsgroup support should also be considered in this PR.
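For illustration only, a sketch of what a dry-run invocation could look like 
from the Java client, assuming the feature lands as a flag on a balance 
request (the API shape below is an assumption, not something this PR defines):
{code:java}
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.BalanceRequest;

public class DryRunBalance {
  // Hypothetical usage: compute and log balancer plans without moving regions.
  static void dryRun(Admin admin) throws Exception {
    BalanceRequest request = BalanceRequest.newBuilder()
        .setDryRun(true) // assumed flag: plan only, execute nothing
        .build();
    admin.balance(request); // with dry run set, no regions would move
  }
}
{code}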


was (Author: rda3mon):
[~bbeaudreault] This comment is not directly about the feature as such.
 # The command name could be in line with the existing balancer command, maybe 
`balancer 'dry_run'` or something similar to `force`. 
 # Rsgroup support should also be considered in this PR.

> Add dry run mode to hbase balancer
> --
>
> Key: HBASE-26147
> URL: https://issues.apache.org/jira/browse/HBASE-26147
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, master
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>
> It's often rather hard to know how the cost function changes you're making 
> will affect the balance of the cluster, and currently the only way to know is 
> to run it. If the cost decisions are not good, you may have just moved many 
> regions towards a non-ideal balance. Region moves themselves are not free for 
> clients, and the resulting balance may cause a regression.
> We should add a mode to the balancer so that it can be invoked without 
> actually executing any plans. This will allow an administrator to iterate on 
> their cost functions and use the balancer's logging to see how their changes 
> would affect the cluster. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-26147) Add dry run mode to hbase balancer

2021-07-28 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-26147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17389189#comment-17389189
 ] 

Mallikarjun commented on HBASE-26147:
-

[~bbeaudreault] This comment is not directly about the feature as such.
 # The command name could be in line with the existing balancer command, maybe 
`balancer 'dry_run'` or something similar to `force`. 
 # Rsgroup support should also be considered in this PR.

> Add dry run mode to hbase balancer
> --
>
> Key: HBASE-26147
> URL: https://issues.apache.org/jira/browse/HBASE-26147
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, master
>Reporter: Bryan Beaudreault
>Assignee: Bryan Beaudreault
>Priority: Major
>
> It's often rather hard to know how the cost function changes you're making 
> will affect the balance of the cluster, and currently the only way to know is 
> to run it. If the cost decisions are not good, you may have just moved many 
> regions towards a non-ideal balance. Region moves themselves are not free for 
> clients, and the resulting balance may cause a regression.
> We should add a mode to the balancer so that it can be invoked without 
> actually executing any plans. This will allow an administrator to iterate on 
> their cost functions and use the balancer's logging to see how their changes 
> would affect the cluster. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386921#comment-17386921
 ] 

Mallikarjun edited comment on HBASE-25891 at 7/25/21, 5:08 PM:
---

[~zhangduo] Corrected the description (agree, it was confusing). Let me know if 
there is any specific part that requires clarification.

Summarizing the changes:
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.


was (Author: rda3mon):
[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the description.

Let me know if there is any specific part that requires clarification.

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format.  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> *Use Case 1.*
> *Existing Design:* To clean up WALs for which a backup has already been 
> taken, `BackupLogCleaner` uses these references to clean up backed-up logs.
> *New Design:*
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`.

[jira] [Updated] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Context:

Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
Also, every backup (incremental and full) performs a log roll just before 
taking the backup and stores the timestamp at which the log roll was performed, 
per region server per backup, in the following format.  
{code:java}
// code placeholder
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
column=meta:rs-log-ts, timestamp=1622887363275, value=\x00\x00\x01y\xDB\x81\x85
{code}
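As an aside, the `value` bytes above appear to be an 8-byte big-endian long 
holding the roll timestamp in epoch milliseconds; a minimal sketch decoding 
the first row's value (assuming HBase's `Bytes` utility):
{code:java}
import org.apache.hadoop.hbase.util.Bytes;

public class RsLogTsDecode {
  public static void main(String[] args) {
    // \x00\x00\x01y\xDB\x81ar from the first row ('y' = 0x79, 'a' = 0x61,
    // 'r' = 0x72), read as a big-endian long.
    byte[] value = new byte[] { 0x00, 0x00, 0x01, 0x79,
        (byte) 0xDB, (byte) 0x81, 0x61, 0x72 };
    long rollTs = Bytes.toLong(value);
    System.out.println(rollTs); // 1622885359986 -> millis, close to the cell ts
  }
}
{code}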
 

There are two cases in which the WAL references stored in `backup:system` are 
used. 

*Use Case 1.*

*Existing Design:* To clean up WALs for which a backup has already been taken, 
`BackupLogCleaner` uses these references to clean up backed-up logs.

*New Design:*

Since the log roll timestamp is stored as part of each backup per region 
server, we can check all previous successful backups and identify which logs to 
retain and which to clean up, as follows (see the sketch after this list):
 * Identify the latest successful backups performed per table.
 * Per backup identified above, identify the oldest log roll 
timestamp recorded per region server per table. 
 * All WALs older than the oldest log roll timestamp recorded 
for any backed-up table can be removed by `BackupLogCleaner`. 
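A minimal sketch of that retention rule (type and method names are 
illustrative assumptions, not the actual BackupLogCleaner code):
{code:java}
import java.util.Collection;
import java.util.Map;

class WalRetentionSketch {
  /**
   * Decide whether a WAL written by the given region server can be deleted.
   * latestBackups holds, for each table's latest successful backup, the
   * per-region-server roll timestamps (epoch millis) recorded at backup time.
   */
  static boolean canDelete(String regionServer, long walWriteTs,
      Collection<Map<String, Long>> latestBackups) {
    long oldestRoll = Long.MAX_VALUE;
    for (Map<String, Long> rsToRollTs : latestBackups) {
      Long ts = rsToRollTs.get(regionServer);
      if (ts == null) {
        // No roll record for this server in some backup: keep the WAL.
        return false;
      }
      oldestRoll = Math.min(oldestRoll, ts);
    }
    // Deletable only if the WAL predates every recorded roll timestamp.
    return oldestRoll != Long.MAX_VALUE && walWriteTs < oldestRoll;
  }
}
{code}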

 

*Use Case 2.* 

*Existing Design:* During an incremental backup, the system table is checked 
for any duplicate WALs for which a backup was already taken. 

*New Design:*
 * Incremental backup already identifies which WALs are to be backed up using 
the `rslogts:` entries mentioned above.
 * Additionally, it checks `wals:` to ensure no log is backed up a second 
time; this is redundant, and no extra benefit has been seen (see the sketch 
below). 
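A sketch of the selection rule already implied by `rslogts:` (names are 
illustrative assumptions): a WAL qualifies for the next incremental backup iff 
it was written at or after the previous backup's roll timestamp for its region 
server, so the extra `wals:` lookup cannot change the outcome.
{code:java}
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class IncrementalWalSelection {
  // walNameToWriteTs: WAL file name -> write timestamp (epoch millis).
  // lastBackupRollTs: region server -> roll timestamp of the previous backup.
  static List<String> select(Map<String, Long> walNameToWriteTs,
      String regionServer, Map<String, Long> lastBackupRollTs) {
    long since = lastBackupRollTs.getOrDefault(regionServer, 0L);
    return walNameToWriteTs.entrySet().stream()
        .filter(e -> e.getValue() >= since)   // anything older was covered
        .map(Map.Entry::getKey)
        .collect(Collectors.toList());
  }
}
{code}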

  was:
Context:

Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
Also, every backup (incremental and full) performs a log roll just before 
taking the backup and stores the timestamp at which the log roll was performed, 
per region server per backup, in the following format. 

 
{code:java}
// code placeholder
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
column=meta:rs-log-ts, timestamp=1622887363275, value=\x00\x00\x01y\xDB\x81\x85
{code}
 

There are two cases in which the WAL references stored in `backup:system` are 
used. 

[jira] [Comment Edited] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386921#comment-17386921
 ] 

Mallikarjun edited comment on HBASE-25891 at 7/25/21, 5:02 PM:
---

[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the description.

Let me know if there is any specific part that requires clarification.


was (Author: rda3mon):
[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the description.

Let me know if there is any specific part that requires clarification.

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded 
> for any backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During an incremental backup, to check the system table for any duplicate 
> WALs for which a backup was already taken. 

[jira] [Comment Edited] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386921#comment-17386921
 ] 

Mallikarjun edited comment on HBASE-25891 at 7/25/21, 5:01 PM:
---

[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the title.

Let me know if there is any specific part that requires clarification.


was (Author: rda3mon):
[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the title.

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded 
> for any backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During an incremental backup, to check the system table for any 
> duplicate WALs for which a backup was already taken. 
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.

[jira] [Comment Edited] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386921#comment-17386921
 ] 

Mallikarjun edited comment on HBASE-25891 at 7/25/21, 5:00 PM:
---

[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) Also, the list grows huge for large 
clusters; for example, we have a 300-node cluster where incremental backups run 
often. 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the title.


was (Author: rda3mon):
[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the title.

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded 
> for any backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During an incremental backup, to check the system table for any 
> duplicate WALs for which a backup was already taken. 
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally, it checks `wals:` to ensure no log is backed up a second 
> time. 

[jira] [Commented] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17386921#comment-17386921
 ] 

Mallikarjun commented on HBASE-25891:
-

[~zhangduo]
 # Every region server's WAL names are stored in `*_backup:system_*` as a 
reference used by BackupLogCleaner. This is unnecessary, as every backup's meta 
information stores the timestamp at which the backup was initiated (backup WAL 
roll), and BackupLogCleaner can use it to clean backed-up WAL logs. (I have 
given an example in the description.) 
 # The `_*tableSetTimestampMap*_` field is present in `_*BackupInfo*_` but was 
missed while storing in `_*backup:system*_`. It is useful for scenarios like 
BackupLogCleaner, so I added it to `Backup.proto`.

Correcting the title.

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded 
> for any backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During an incremental backup, to check the system table for any 
> duplicate WALs for which a backup was already taken. 
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally, it checks `wals:` to ensure no log is backed up a second 
> time; this is redundant, and no extra benefit has been seen. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Summary: Remove dependence on storing WAL filenames for backup  (was: 
Remove dependence storing WAL filenames for backup)

> Remove dependence on storing WAL filenames for backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded 
> for any backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During an incremental backup, to check the system table for any 
> duplicate WALs for which a backup was already taken. 
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally, it checks `wals:` to ensure no log is backed up a second 
> time; this is redundant, and no extra benefit has been seen. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence on storing WAL filenames for backup

2021-07-25 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Context:

Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
Also, every backup (incremental and full) performs a log roll just before 
taking the backup and stores the timestamp at which the log roll was performed, 
per region server per backup, in the following format. 

 
{code:java}
// code placeholder
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
column=meta:rs-log-ts, timestamp=1622887363275, value=\x00\x00\x01y\xDB\x81\x85
{code}
 

There are two cases in which the WAL references stored in `backup:system` are 
used. 

1. To clean up WALs for which a backup has already been taken, using `BackupLogCleaner` 

Since the log roll timestamp is stored as part of each backup per region 
server, we can check all previous successful backups and identify which logs to 
retain and which to clean up, as follows:
 * Identify the latest successful backups performed per table.
 * Per backup identified above, identify the oldest log roll 
timestamp recorded per region server per table. 
 * All WALs older than the oldest log roll timestamp recorded 
for any backed-up table can be removed by `BackupLogCleaner`. 

 

2. During an incremental backup, to check the system table for any duplicate 
WALs for which a backup was already taken. 
 * Incremental backup already identifies which WALs are to be backed up using 
the `rslogts:` entries mentioned above.
 * Additionally, it checks `wals:` to ensure no log is backed up a second 
time; this is redundant, and no extra benefit has been seen. 

  was:
Context:

Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
Also, every backup (incremental and full) performs a log roll just before 
taking the backup and stores the timestamp at which the log roll was performed, 
per region server per backup, in the following format. 

 
{code:java}
// code placeholder
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
column=meta:rs-log-ts, timestamp=1622887363275, value=\x00\x00\x01y\xDB\x81\x85
{code}
 

 

There are two cases in which the WAL references stored in `backup:system` are 
used. 

1. To clean up WALs for which a backup has already been taken, using `BackupLogCleaner` 

Since the log roll timestamp is stored as part of each backup per region 
server, we can check all previous successful backups and identify which logs to 
retain and which to clean up. 

[jira] [Commented] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-07-18 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17382994#comment-17382994
 ] 

Mallikarjun commented on HBASE-25891:
-

This will be a much-needed enhancement for extending this functionality to 
rsgroups and upcoming HBase backup/restore work, especially 
before HBase 3.0 goes live, as this changes information stored in meta. I would 
like someone to spend time reviewing this. [~anoop.hbase] [~zhangduo]

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup and stores the timestamp at which the log roll was 
> performed, per region server per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
>  
> There are two cases in which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per region 
> server, we can check all previous successful backups and identify which logs 
> to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, identify the oldest log roll 
> timestamp recorded per region server per table. 
>  * All WALs older than the oldest log roll timestamp recorded 
> for any backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During an incremental backup, to check the system table for any 
> duplicate WALs for which a backup was already taken. 
>  * Incremental backup already identifies which WALs are to be backed up 
> using the `rslogts:` entries mentioned above.
>  * Additionally, it checks `wals:` to ensure no log is backed up a second 
> time; this is redundant, and no extra benefit has been seen. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take parallel backups

2021-06-27 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Description: Details to be filled.  (was: TODO:)

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> Details to be filled.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take parallel backups

2021-06-27 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Summary: Add support to take parallel backups  (was: Add support to take 
multiple parallel backup)

> Add support to take parallel backups
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> TODO:



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-27 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17370227#comment-17370227
 ] 

Mallikarjun commented on HBASE-25891:
-

[~anoop.hbase] 
[~stack] 
[~zhangduo] 

Can someone help me get this reviewed, please? 

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup, and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
>  
> There are two cases for which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per 
> regionserver, we can check all previous successful backups and identify 
> which logs to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log-roll timestamp per 
> regionserver per table. 
>  * All WALs older than the oldest log-roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During incremental backup, to check the system table for duplicate WALs 
> that would otherwise be backed up again. 
>  * Incremental backup already identifies which WALs to back up using the 
> `rslogts:` rows mentioned above.
>  * Additionally it checks `wals:` to ensure no log is backed up a second 
> time. This check is redundant and has shown no extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-26034) Add support to take multiple parallel backup

2021-06-27 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-26034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-26034:

Description: TODO:

> Add support to take multiple parallel backup
> 
>
> Key: HBASE-26034
> URL: https://issues.apache.org/jira/browse/HBASE-26034
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-2
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-2
>
>
> TODO:



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (HBASE-26034) Add support to take multiple parallel backup

2021-06-27 Thread Mallikarjun (Jira)
Mallikarjun created HBASE-26034:
---

 Summary: Add support to take multiple parallel backup
 Key: HBASE-26034
 URL: https://issues.apache.org/jira/browse/HBASE-26034
 Project: HBase
  Issue Type: Improvement
  Components: backuprestore
Affects Versions: 3.0.0-alpha-2
Reporter: Mallikarjun
Assignee: Mallikarjun
 Fix For: 3.0.0-alpha-2






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-08 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17359302#comment-17359302
 ] 

Mallikarjun commented on HBASE-25891:
-

[~anoop.hbase] Did you get a chance to look at it?

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup, and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
>  
> There are two cases for which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per 
> regionserver, we can check all previous successful backups and identify 
> which logs to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log-roll timestamp per 
> regionserver per table. 
>  * All WALs older than the oldest log-roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During incremental backup, to check the system table for duplicate WALs 
> that would otherwise be backed up again. 
>  * Incremental backup already identifies which WALs to back up using the 
> `rslogts:` rows mentioned above.
>  * Additionally it checks `wals:` to ensure no log is backed up a second 
> time. This check is redundant and has shown no extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-05 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17358007#comment-17358007
 ] 

Mallikarjun commented on HBASE-25891:
-

I have updated the description above; hopefully it answers your questions, 
[~anoop.hbase]. Adding details specific to the questions here.
{quote}Means the WAL files will get renamed with this prefix? When do those 
files become eligible for deletion then?
{quote}
No. They are cleaned up by a cleanup chore, similar to `TimeToLiveLogCleaner`. 

 
{quote}Now that we don't have this system table at all, what happens when 
taking a full/incremental snapshot? 
{quote}
Full backup does snapshot and export. There is no dependence on WAL files. 

Incremental backup continues to check `rslogts:` to see up to what timestamp 
each regionserver was backed up, and based on that, which WAL files need to be 
backed up.  
{quote}How are WAL files retained when a backup refers to them? When do they 
become eligible for deletion? (Backup deleted / another full backup came?) And 
how do we make sure we allow WAL deletion then?
{quote}
We don't need to store the list of WAL files for that. We keep checkpoints 
recording up to what point WALs have been read for backup: all WAL files 
created after that timestamp are automatically eligible for backup, and those 
created before it can be cleaned up. 
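A minimal sketch of that checkpoint rule (standalone and illustrative, not the 
actual incremental-backup code; all names are hypothetical):

{code:java}
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class WalCheckpointSketch {
  /**
   * checkpoints: regionserver -> last backed-up log-roll timestamp (the
   * rslogts: rows). A WAL created after its server's checkpoint still needs
   * to be shipped; one created before it can be cleaned up.
   */
  static List<String> walsStillNeeded(Map<String, Long> walCreationTs,
      Map<String, String> walToServer, Map<String, Long> checkpoints) {
    List<String> needed = new ArrayList<>();
    for (Map.Entry<String, Long> e : walCreationTs.entrySet()) {
      Long checkpoint = checkpoints.get(walToServer.get(e.getKey()));
      if (checkpoint == null || e.getValue() > checkpoint) {
        needed.add(e.getKey()); // newer than the checkpoint: back it up
      }
    }
    return needed;
  }
}
{code}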

 

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Context:
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Also, every backup (incremental and full) performs a log roll just before 
> taking the backup, and stores the timestamp at which the log roll was 
> performed, per regionserver per backup, in the following format. 
>  
> {code:java}
> // code placeholder
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
> column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
> column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
> rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
> column=meta:rs-log-ts, timestamp=1622887363275, 
> value=\x00\x00\x01y\xDB\x81\x85
> {code}
>  
>  
> There are two cases for which the WAL references stored in `backup:system` 
> are used. 
> 1. To clean up WALs for which a backup has already been taken, using 
> `BackupLogCleaner`. 
> Since the log roll timestamp is stored as part of each backup per 
> regionserver, we can check all previous successful backups and identify 
> which logs to retain and which to clean up, as follows:
>  * Identify the latest successful backups performed per table.
>  * Per backup identified above, find the oldest log-roll timestamp per 
> regionserver per table. 
>  * All WALs older than the oldest log-roll timestamp recorded for any 
> backed-up table can be removed by `BackupLogCleaner`. 
>  
> 2. During incremental backup, to check the system table for duplicate WALs 
> that would otherwise be backed up again. 
>  * Incremental backup already identifies which WALs to back up using the 
> `rslogts:` rows mentioned above.
>  * Additionally it checks `wals:` to ensure no log is backed up a second 
> time. This check is redundant and has shown no extra benefit. 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Context:

Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
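As an aside, a minimal sketch of how one of these `wals:` rows can be probed 
to check whether a given log was already shipped (the duplicate check 
discussed in case 2 below); the helper name and table handle are illustrative:

{code:java}
import java.io.IOException;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class WalRowCheckSketch {
  // Returns true if a "wals:" row already records this WAL file name,
  // i.e. the log was shipped by an earlier backup.
  static boolean alreadyBackedUp(Table backupSystem, String walName)
      throws IOException {
    Get get = new Get(Bytes.toBytes("wals:" + walName));
    get.addColumn(Bytes.toBytes("meta"), Bytes.toBytes("backupId"));
    return !backupSystem.get(get).isEmpty();
  }
}
{code}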
Also, every backup (incremental and full) performs a log roll just before 
taking the backup, and stores the timestamp at which the log roll was 
performed, per regionserver per backup, in the following format. 

 
{code:java}
// code placeholder
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-2:16020 
column=meta:rs-log-ts, timestamp=1622887363301,value=\x00\x00\x01y\xDB\x81ar
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-3:16020 
column=meta:rs-log-ts, timestamp=1622887363294, value=\x00\x00\x01y\xDB\x81aP
rslogts:hdfs://xx.xx.xx.xx:8020/tmp/backup_yaktest\x00preprod-dn-1:16020 
column=meta:rs-log-ts, timestamp=1622887363275, value=\x00\x00\x01y\xDB\x81\x85
{code}
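The `rslogts:` value shown above is just an 8-byte big-endian timestamp; a 
minimal decoding sketch, assuming the value was written with 
`Bytes.toBytes(long)` as the hex dump suggests:

{code:java}
import org.apache.hadoop.hbase.util.Bytes;

public class RsLogTsDecodeSketch {
  public static void main(String[] args) {
    // \x00\x00\x01y\xDB\x81ar from the dump above: 'y' = 0x79, 'a' = 0x61,
    // 'r' = 0x72, so the cell value is an 8-byte big-endian long.
    byte[] value =
      { 0x00, 0x00, 0x01, 0x79, (byte) 0xDB, (byte) 0x81, 0x61, 0x72 };
    long rollTs = Bytes.toLong(value);
    System.out.println(rollTs); // 1622885359986, epoch millis of the log roll
  }
}
{code}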
 

 

There are two cases for which the WAL references stored in `backup:system` are 
used. 

1. To clean up WALs for which a backup has already been taken, using 
`BackupLogCleaner`. 

Since the log roll timestamp is stored as part of each backup per 
regionserver, we can check all previous successful backups and identify which 
logs to retain and which to clean up, as follows (a minimal sketch follows 
this description):
 * Identify the latest successful backups performed per table.
 * Per backup identified above, find the oldest log-roll timestamp per 
regionserver per table. 
 * All WALs older than the oldest log-roll timestamp recorded for any 
backed-up table can be removed by `BackupLogCleaner`. 

2. During incremental backup, to check the system table for duplicate WALs 
that would otherwise be backed up again. 
 * Incremental backup already identifies which WALs to back up using the 
`rslogts:` rows mentioned above.
 * Additionally it checks `wals:` to ensure no log is backed up a second time. 
This check is redundant and has shown no extra benefit. 
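Putting case 1 together, a minimal standalone sketch of the retention decision 
(illustrative names, not the actual `BackupLogCleaner` code):

{code:java}
import java.util.Collection;
import java.util.Map;

public class LogCleanerSketch {
  /**
   * A WAL written by a regionserver is deletable once, for every backed-up
   * table, the latest successful backup rolled that server's logs past the
   * WAL's creation time. perTableRollTs: per table, server -> oldest roll ts.
   */
  static boolean isDeletable(String server, long walCreationTs,
      Collection<Map<String, Long>> perTableRollTs) {
    for (Map<String, Long> rollTsByServer : perTableRollTs) {
      Long rollTs = rollTsByServer.get(server);
      // Unknown server, or WAL newer than this table's checkpoint:
      // some table may still need the WAL for its next incremental backup.
      if (rollTs == null || walCreationTs >= rollTs) {
        return false;
      }
    }
    return true;
  }
}
{code}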

  was:
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
Primarily used for the following:
 * WALs are stored in the meta table to check whether a particular log has 
been backed up or not.
 * To check during incremental backup whether a particular WAL was already 
covered by a previous incremental backup.

Changes for the above two use cases:
 * Since the log roll during an incremental or full backup is stored with the 
prefix `trslm:`, it can be used to identify which log files can be cleaned up.
 * The check during incremental backup of whether a particular WAL has already 
been backed up is redundant; no such check is required.

 


> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
> 

[jira] [Commented] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-05 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17357912#comment-17357912
 ] 

Mallikarjun commented on HBASE-25891:
-

[~anoop.hbase] I have made certain changes to this PR (not related to the 
multi-tenancy scope planned earlier) and updated the description accordingly 
to reflect what the changes are. Please let me know if this is sufficient.

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Primarily used for the following:
>  * WALs are stored in the meta table to check whether a particular log has 
> been backed up or not.
>  * To check during incremental backup whether a particular WAL was already 
> covered by a previous incremental backup.
> Changes for the above two use cases:
>  * Since the log roll during an incremental or full backup is stored with 
> the prefix `trslm:`, it can be used to identify which log files can be 
> cleaned up.
>  * The check during incremental backup of whether a particular WAL has 
> already been backed up is redundant; no such check is required.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-05 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
Primarily used for the following:
 * WALs are stored in the meta table to check whether a particular log has 
been backed up or not.
 * To check during incremental backup whether a particular WAL was already 
covered by a previous incremental backup.

Changes for the above two use cases:
 * Since the log roll during an incremental or full backup is stored with the 
prefix `trslm:`, it can be used to identify which log files can be cleaned up.
 * The check during incremental backup of whether a particular WAL has already 
been backed up is redundant; no such check is required.

 

  was:
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backup is 
maintained and never cleaned up. 
 # It is unnecessary to keep WALs listed in order to perform incremental 
backups or to run the log cleaner.
 # No support for rsgroups. Hence tables belonging to rsgroups that don't have 
backup enabled also have to retain WALs, forever.

 

Proposed Solution:

 

 


> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
> Primarily used for the following:
>  * WALs are stored in the meta table to check whether a particular log has 
> been backed up or not.
>  * To check during incremental backup whether a particular WAL was already 
> covered 

[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-06-03 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backup is 
maintained and never cleaned up. 
 # It is unnecessary to keep WALs listed in order to perform incremental 
backups or to run the log cleaner.
 # No support for rsgroups. Hence tables belonging to rsgroups that don't have 
backup enabled also have to retain WALs, forever.

 

Proposed Solution:

 

 

  was:
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backup is 
maintained and never cleaned up. 
 # It is unnecessary to keep WALs listed in order to perform incremental 
backups or to run the log cleaner.
 # No support for rsgroups. Hence for rsgroups that have no backup-enabled 
tables, WALs are retained forever.

 


> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backup is 
> maintained and never cleaned up. 
>  # It is unnecessary to keep WALs listed in order to perform incremental 
> backups or to run the log cleaner.
>  # No support for rsgroups. Hence tables belonging to rsgroups that don't 
> have backup enabled also have to retain WALs, forever.
>  
> Proposed Solution:
>  
>  




[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-06-03 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Description: 
*Existing Design*

*!image-2021-06-03-16-33-59-282.png|width=292,height=408!*

*Problem 1:* 
With this design, incremental and full backups can't run in parallel, leading 
to degraded RPOs when the full backup runs long, especially for large tables.
  
Example: 
Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you can ship the remote backup at 800 Mbps. You are allowed to take full 
backups once a week, and the rest should be incremental backups.
  
Shortcoming: With the above design, one can't run parallel backups, and 
whenever a full backup is running (which takes roughly 25 hours) you cannot 
take incremental backups, which would be a breach of your RPO. 
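For reference, the transfer-time arithmetic behind the "roughly 25 hours" 
figure, a back-of-the-envelope bound that ignores compression and protocol 
overhead:

t = (10 TB * 8 bits/byte) / 800 Mbps = (8 * 10^7 Mb) / (800 Mb/s)
  = 10^5 s ≈ 27.8 hours

so even the ideal shipping time for such a full backup is more than a day.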
  
*Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups would then run based on older successful full/incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering (a sketch follows below). I have not worked on the 
full redesign, and will do so if this proposal seems acceptable to the 
community.
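A minimal sketch of that ordering rule, with illustrative types (the real 
backup metadata lives in `backup:system`):

{code:java}
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

public class BackupOrderingSketch {
  // Illustrative holder for a successful backup's metadata.
  static final class FinishedBackup {
    final String id;
    final long startTs;
    final long completeTs;
    FinishedBackup(String id, long startTs, long completeTs) {
      this.id = id;
      this.startTs = startTs;
      this.completeTs = completeTs;
    }
  }

  /**
   * Parent for the next incremental = the latest *completed* successful
   * backup. Ordering by completion time means a long-running full backup
   * that started first but finished last is correctly ordered after a
   * short incremental that finished earlier.
   */
  static Optional<FinishedBackup> parentForNextIncremental(
      List<FinishedBackup> successful) {
    return successful.stream()
        .max(Comparator.comparingLong(b -> b.completeTs));
  }
}
{code}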
  
*Problem 2:*
With one backup at a time, the scheme breaks down easily for a multi-tenant 
system. This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration 
of the backup.
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as the previous proposal.
  
*Problem 3:* 
Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full or incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded when there are transient problems, such as the 
backup site facing issues, until the next scheduled backup succeeds.

*Proposed Solution:* I can't think of anything better, but I see this can be a 
potential problem. Also, one can force a full backup if required WAL files are 
missing for whatever other reasons, not necessarily mentioned above. 
  

Proposed Design.

!https://i.ibb.co/vVV1BTs/Backup-Activity-Diagram.png|width=322,height=414!

  was:
*Problem 1:* 
With this design, incremental and full backups can't run in parallel, leading 
to degraded RPOs when the full backup runs long, especially for large tables.
 
Example: 
Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you can ship the remote backup at 800 Mbps. You are allowed to take full 
backups once a week, and the rest should be incremental backups.
 
Shortcoming: With the above design, one can't run parallel backups, and 
whenever a full backup is running (which takes roughly 25 hours) you cannot 
take incremental backups, which would be a breach of your RPO. 
 
*Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups would then run based on older successful full/incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering. I have not worked on the full redesign, and will do 
so if this proposal seems acceptable to the community.
 
*Problem 2:*
With one backup at a time, the scheme breaks down easily for a multi-tenant 
system. This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration 
of the backup.
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as the previous proposal.
 
*Problem 3:* 
Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full or incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded when there are transient problems, such as the 
backup site facing issues, until the next scheduled backup succeeds.

*Proposed Solution:* I can't think of anything better, but I see this can be a 
potential problem. Also, one can force a full backup if required WAL files are 
missing for whatever other reasons, not necessarily mentioned above. 
 

Proposed Design.


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-06-03 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Description: 
*Existing Design*

*!image-2021-06-03-16-33-59-282.png|width=292,height=408!*

*Problem 1:* 
With this design, incremental and full backups can't run in parallel, leading 
to degraded RPOs when the full backup runs long, especially for large tables.
  
Example: 
Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you can ship the remote backup at 800 Mbps. You are allowed to take full 
backups once a week, and the rest should be incremental backups.
  
Shortcoming: With the above design, one can't run parallel backups, and 
whenever a full backup is running (which takes roughly 25 hours) you cannot 
take incremental backups, which would be a breach of your RPO. 
  
*Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups would then run based on older successful full/incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering. I have not worked on the full redesign, and will do 
so if this proposal seems acceptable to the community.
  
*Problem 2:*
With one backup at a time, the scheme breaks down easily for a multi-tenant 
system. This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration 
of the backup.
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as the previous proposal.
  
*Problem 3:* 
Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full or incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded when there are transient problems, such as the 
backup site facing issues, until the next scheduled backup succeeds.

*Proposed Solution:* I can't think of anything better, but I see this can be a 
potential problem. Also, one can force a full backup if required WAL files are 
missing for whatever other reasons, not necessarily mentioned above. 
  

*Proposed Design.*

!image-2021-06-03-16-34-34-957.png|width=324,height=416!

  was:
*Existing Design*

*!image-2021-06-03-16-33-59-282.png|width=292,height=408!*

*Problem 1:* 
With this design, incremental and full backups can't run in parallel, leading 
to degraded RPOs when the full backup runs long, especially for large tables.
  
Example: 
Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
you can ship the remote backup at 800 Mbps. You are allowed to take full 
backups once a week, and the rest should be incremental backups.
  
Shortcoming: With the above design, one can't run parallel backups, and 
whenever a full backup is running (which takes roughly 25 hours) you cannot 
take incremental backups, which would be a breach of your RPO. 
  
*Proposed Solution:* Barring some critical sections, such as modifying the 
state of the backup in the meta tables, the rest can happen in parallel. 
Incremental backups would then run based on older successful full/incremental 
backups, and the completion time of a backup should be used instead of its 
start time for ordering. I have not worked on the full redesign, and will do 
so if this proposal seems acceptable to the community.
  
*Problem 2:*
With one backup at a time, the scheme breaks down easily for a multi-tenant 
system. This poses the following problems:
 * Admins will not be able to achieve the required RPOs for their tables 
because of dependence on other tenants present in the system, as one tenant 
doesn't have control over other tenants' table sizes and hence the duration 
of the backup.
 * The management overhead of setting up the right sequence to achieve the 
required RPOs for different tenants could be very high.

*Proposed Solution:* Same as the previous proposal.
  
*Problem 3:* 
Incremental backup works on WALs, and 
org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
never cleaned up until the next backup (full or incremental) is taken. This 
poses the following problem:
 * WALs can grow unbounded when there are transient problems, such as the 
backup site facing issues, until the next scheduled backup succeeds.
 *Proposed Solution:* I can't think of anything better, but I see this can be a 
potential problem. Also, one can force full backup if required WAL files are 
missing for whatever other reasons not necessarily 

[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-06-03 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: image-2021-06-03-16-34-34-957.png

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
> Attachments: image-2021-06-03-16-33-59-282.png, 
> image-2021-06-03-16-34-34-957.png
>
>
> *Existing Design*
> *!image-2021-06-03-16-33-59-282.png|width=292,height=408!*
> *Problem 1:* 
>  With this design, incremental and full backups can't run in parallel, 
> leading to degraded RPOs when the full backup runs long, especially for 
> large tables.
>   
>  Example: 
>  Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you can ship the remote backup at 800 Mbps. You are allowed to take full 
> backups once a week, and the rest should be incremental backups.
>   
>  Shortcoming: With the above design, one can't run parallel backups, and 
> whenever a full backup is running (which takes roughly 25 hours) you cannot 
> take incremental backups, which would be a breach of your RPO. 
>   
>  *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups would then run based on older successful 
> full/incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>   
>  *Problem 2:*
>  With one backup at a time, the scheme breaks down easily for a multi-tenant 
> system. This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup.
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as the previous proposal.
>   
>  *Problem 3:* 
>  Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full or incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded when there are transient problems, such as the 
> backup site facing issues, until the next scheduled backup succeeds.
>  *Proposed Solution:* I can't think of anything better, but I see this can 
> be a potential problem. Also, one can force a full backup if required WAL 
> files are missing for whatever other reasons, not necessarily mentioned 
> above. 
>   
> Proposed Design.
> !https://i.ibb.co/vVV1BTs/Backup-Activity-Diagram.png|width=322,height=414!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25784) Support for Parallel Backups enabling multi tenancy with rsgroups

2021-06-03 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25784:

Attachment: image-2021-06-03-16-33-59-282.png

> Support for Parallel Backups enabling multi tenancy with rsgroups
> -
>
> Key: HBASE-25784
> URL: https://issues.apache.org/jira/browse/HBASE-25784
> Project: HBase
>  Issue Type: Umbrella
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>  Labels: backup
> Attachments: image-2021-06-03-16-33-59-282.png, 
> image-2021-06-03-16-34-34-957.png
>
>
> *Problem 1:* 
> With this design, incremental and full backups can't run in parallel, 
> leading to degraded RPOs when the full backup runs long, especially for 
> large tables.
>  
> Example: 
> Expectation: Say you have a big table of 10 TB, your RPO is 60 minutes, and 
> you can ship the remote backup at 800 Mbps. You are allowed to take full 
> backups once a week, and the rest should be incremental backups.
>  
> Shortcoming: With the above design, one can't run parallel backups, and 
> whenever a full backup is running (which takes roughly 25 hours) you cannot 
> take incremental backups, which would be a breach of your RPO. 
>  
> *Proposed Solution:* Barring some critical sections, such as modifying the 
> state of the backup in the meta tables, the rest can happen in parallel. 
> Incremental backups would then run based on older successful 
> full/incremental backups, and the completion time of a backup should be used 
> instead of its start time for ordering. I have not worked on the full 
> redesign, and will do so if this proposal seems acceptable to the community.
>  
> *Problem 2:*
> With one backup at a time, the scheme breaks down easily for a multi-tenant 
> system. This poses the following problems:
>  * Admins will not be able to achieve the required RPOs for their tables 
> because of dependence on other tenants present in the system, as one tenant 
> doesn't have control over other tenants' table sizes and hence the duration 
> of the backup.
>  * The management overhead of setting up the right sequence to achieve the 
> required RPOs for different tenants could be very high.
> *Proposed Solution:* Same as the previous proposal.
>  
> *Problem 3:* 
> Incremental backup works on WALs, and 
> org.apache.hadoop.hbase.backup.master.BackupLogCleaner ensures that WALs are 
> never cleaned up until the next backup (full or incremental) is taken. This 
> poses the following problem:
>  * WALs can grow unbounded when there are transient problems, such as the 
> backup site facing issues, until the next scheduled backup succeeds.
> *Proposed Solution:* I can't think of anything better, but I see this can be 
> a potential problem. Also, one can force a full backup if required WAL files 
> are missing for whatever other reasons, not necessarily mentioned above. 
>  
> Proposed Design.
> !https://i.ibb.co/vVV1BTs/Backup-Activity-Diagram.png|width=322,height=414!



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-05-31 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17354430#comment-17354430
 ] 

Mallikarjun commented on HBASE-25891:
-

[~anoop.hbase] Not there completely. Let me put down the details and share it.

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backup is 
> maintained and never cleaned up. 
>  # It is unnecessary to keep WALs listed in order to perform incremental 
> backups or to run the log cleaner.
>  # No support for rsgroups. Hence for rsgroups that have no backup-enabled 
> tables, WALs are retained forever.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-05-30 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Fix Version/s: 3.0.0-alpha-1

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backup is 
> maintained and never cleaned up. 
>  # It is unnecessary to keep WALs listed in order to perform incremental 
> backups or to run the log cleaner.
>  # No support for rsgroups. Hence for rsgroups that have no backup-enabled 
> tables, WALs are retained forever.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-05-30 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Affects Version/s: 3.0.0-alpha-1

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Affects Versions: 3.0.0-alpha-1
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
>
> Currently WAL logs are stored in `backup:system` meta table 
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
> timestamp=1622003479895, value=backup_1622003358258 
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
> timestamp=1622003479895, 
> value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
>  wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
> timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backup is 
> maintained and never cleaned up. 
>  # It is unnecessary to keep WALs listed in order to perform incremental 
> backups or to run the log cleaner.
>  # No support for rsgroups. Hence for rsgroups that have no backup-enabled 
> tables, WALs are retained forever.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-05-30 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backup is 
maintained and never cleaned up. 
 # It is unnecessary to keep WALs listed in order to perform incremental 
backups or to run the log cleaner.
 # No support for rsgroups. Hence for rsgroups that have no backup-enabled 
tables, WALs are retained forever.

 

  was:
Currently WAL logs are stored in `backup:system` meta table 
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
 wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, 
timestamp=1622003479895, value=backup_1622003358258 
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, 
timestamp=1622003479895, 
value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
 wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, 
timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1 
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backup is 
maintained and never cleaned up. 
 # It is unnecessary to keep WALs listed in order to perform incremental 
backups or to run the log cleaner.
 # No support for rsgroups. Hence for rsgroups that don't have backup enabled, 
WALs are retained forever.

 


> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Currently WAL logs are stored in the `backup:system` meta table:
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
>  # Listing individual WAL files is unnecessary for performing incremental backups or for running the log cleaner.
>  # No support for rsgroup: WALs are retained forever for rsgroups that have no backup-enabled tables.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove dependence storing WAL filenames for backup

2021-05-26 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Summary: Remove dependence storing WAL filenames for backup  (was: Remove 
the dependence of storing WAL filenames for incremental backup)

> Remove dependence storing WAL filenames for backup
> --
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Currently WAL logs are stored in the `backup:system` meta table:
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
>  # Listing individual WAL files is unnecessary for performing incremental backups or for running the log cleaner.
>  # No support for rsgroup: WALs are retained forever for rsgroups that do not have backup enabled.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove the dependence of storing WAL filenames for incremental backup

2021-05-26 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Currently WAL logs are stored in the `backup:system` meta table:
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
 # Listing individual WAL files is unnecessary for performing incremental backups or for running the log cleaner (a sketch of a filename-free cleaner follows this list).
 # No support for rsgroup: WALs are retained forever for rsgroups that do not have backup enabled.
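
To make the second point concrete, a hypothetical log-cleaner delegate that decides deletability from a single timestamp boundary instead of per-file rows. `BaseLogCleanerDelegate` is the real HBase extension point, but the boundary field and how it is loaded are assumptions for illustration, not the actual BackupLogCleaner implementation.
{code:java}
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.hbase.master.cleaner.BaseLogCleanerDelegate;

// Sketch only: a filename-free cleaner. The boundary would be read from
// backup metadata (e.g. the start timestamp of the oldest backup session
// that still needs WALs) instead of one backup:system row per WAL file.
public class BoundaryBackupLogCleaner extends BaseLogCleanerDelegate {

  // Assumed: refreshed from backup metadata elsewhere.
  private volatile long oldestNeededWalTs = Long.MIN_VALUE;

  @Override
  protected boolean isFileDeletable(FileStatus fStat) {
    // A WAL in oldWALs is deletable once it is older than anything a
    // pending incremental backup could still need; no per-WAL rows are
    // consulted, so there is nothing to grow or clean up.
    return fStat.getModificationTime() < oldestNeededWalTs;
  }
}
{code}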

 

  was:
Currently WAL logs are stored in the `backup:system` meta table:
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
 

 

 
 # 
 # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
 # Unnecessary to have WAL logs listed for performing incremental backup or log cle

 


> Remove the dependence of storing WAL filenames for incremental backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Currently WAL logs are stored in the `backup:system` meta table:
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> {code}
>  
> This has several problems:
>  # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
>  # Listing individual WAL files is unnecessary for performing incremental backups or for running the log cleaner.
>  # No support for rsgroup: WALs are retained forever for rsgroups that do not have backup enabled.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove the dependence of storing WAL filenames for incremental backup

2021-05-26 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Currently WAL logs are stored in the `backup:system` meta table:
{code:java}
// code placeholder
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
 

 

 
 # 
 # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
 # Unnecessary to have WAL logs listed for performing incremental backup or log cle

 

  was:
Currently WAL logs are stored in `backup:system` meta table 

 
{code:java}
// code placeholder

{code}
 

 

 
 # 
 # Ever growing rows of wal's sourced for incremental backup is maintained and 
never cleaned up. 
 # Unnecessary to have wal log listed for performing incremental backup or log 
cle

 


> Remove the dependence of storing WAL filenames for incremental backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Currently WAL logs are stored in the `backup:system` meta table:
> {code:java}
> // code placeholder
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
> wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
> wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
> {code}
>  
>  
>  
>  # 
>  # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
>  # Unnecessary to have WAL logs listed for performing incremental backup or log cle
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove the dependence of storing WAL filenames for incremental backup

2021-05-26 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Description: 
Currently WAL logs are stored in the `backup:system` meta table:

 
{code:java}
// code placeholder

{code}
 

 

 
 # 
 # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
 # Unnecessary to have WAL logs listed for performing incremental backup or log cle

 

  was:
Here are some of the problems identified.

1. An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.

2. Backup sessions, backup sets, WAL logs, active sessions, merges, etc. are all identified by a row-key prefix, which is not very intuitive (a hypothetical illustration follows).
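
As an illustration only, the kind of prefix-keyed rows being described; the prefix strings below are hypothetical stand-ins, since the real constants live in BackupSystemTable and may differ.
{code:java}
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixKeyedRows {
  public static void main(String[] args) {
    // Hypothetical prefixes: every record type shares one table and is
    // told apart only by the start of its row key.
    byte[] sessionRow = Bytes.toBytes("session:backup_1622003358258");
    byte[] setRow = Bytes.toBytes("backupset:set1");
    byte[] walRow = Bytes.toBytes("wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175");
    // A reader must hard-code each prefix convention to interpret a row,
    // which is the non-intuitive part called out above.
    for (byte[] row : new byte[][] { sessionRow, setRow, walRow }) {
      System.out.println(Bytes.toString(row));
    }
  }
}
{code}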


> Remove the dependence of storing WAL filenames for incremental backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Currently WAL logs are stored in the `backup:system` meta table:
>  
> {code:java}
> // code placeholder
> {code}
>  
>  
>  
>  # 
>  # An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
>  # Unnecessary to have WAL logs listed for performing incremental backup or log cle
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (HBASE-25891) Remove the dependence of storing WAL filenames for incremental backup

2021-05-26 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mallikarjun updated HBASE-25891:

Summary: Remove the dependence of storing WAL filenames for incremental 
backup  (was: Simplify backup table to be able to maintain it better)

> Remove the dependence of storing WAL filenames for incremental backup
> -
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
>  Issue Type: Improvement
>  Components: backuprestore
>Reporter: Mallikarjun
>Assignee: Mallikarjun
>Priority: Major
>
> Here are some of the problems identified.
> 1. An ever-growing set of rows for WALs sourced for incremental backups is maintained and never cleaned up.
> 2. Backup sessions, backup sets, WAL logs, active sessions, merges, etc. are all identified by a row-key prefix, which is not very intuitive.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25888) Backup tests are categorically flakey

2021-05-19 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347587#comment-17347587
 ] 

Mallikarjun commented on HBASE-25888:
-

[~ndimiduk] You can have a look at the PR whenever you have time.

> Backup tests are categorically flakey
> -
>
> Key: HBASE-25888
> URL: https://issues.apache.org/jira/browse/HBASE-25888
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore, test
>Reporter: Nick Dimiduk
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
> Attachments: 
> TEST-org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestBackupMerge.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestFullBackupSet.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.xml.gz,
>  
> TEST-org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.xml.gz, 
> org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt, 
> org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt.gz, 
> org.apache.hadoop.hbase.backup.TestBackupMerge-output.txt.gz, 
> org.apache.hadoop.hbase.backup.TestBackupMerge.txt.gz, 
> org.apache.hadoop.hbase.backup.TestFullBackupSet-output.txt.gz, 
> org.apache.hadoop.hbase.backup.TestFullBackupSet.txt.gz, 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures-output.txt.gz,
>  
> org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.txt.gz, 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad-output.txt.gz,
>  org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.txt.gz, 
> org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests-output.txt.gz, 
> org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.txt.gz
>
>
> Here are some logs from a PR build vs. master that suffered a significant
> number of failures in the backup tests. I suspect that a single improvement
> could make all of these tests more robust.
> {noformat}
> Test Name | Duration | Age
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore | 6 min 23 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.TestIncBackupMergeRestore | 5 min 3 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.testFullBackupSetExist | 6 min 16 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.TestIncBackupMergeRestore | 5 min 55 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable | 5 min 56 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreSingleEmpty | 6 min 5 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreMultipleEmpty | 0.17 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.(?)
> {noformat}
> https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3249/4/testReport/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-25888) Backup tests are categorically flakey

2021-05-18 Thread Mallikarjun (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17347230#comment-17347230
 ] 

Mallikarjun commented on HBASE-25888:
-

It is as simple as a `Setup` and `TearDown` problem: the tests run fine when run individually, but fail as a suite.

I have fixed most of the tests; a few are still pending, and I am looking into them. A minimal sketch of the pattern is below.
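
The sketch assumes the usual HBaseTestingUtility mini-cluster pattern these tests use; the class name and comments are illustrative, not the actual fix.
{code:java}
import org.apache.hadoop.hbase.HBaseTestingUtility;
import org.junit.AfterClass;
import org.junit.BeforeClass;

public class TestBackupExample {
  private static final HBaseTestingUtility TEST_UTIL = new HBaseTestingUtility();

  @BeforeClass
  public static void setUpBeforeClass() throws Exception {
    // Each test class must start from a clean mini-cluster; inheriting
    // state left behind by an earlier class is what breaks the suite run.
    TEST_UTIL.startMiniCluster();
  }

  @AfterClass
  public static void tearDownAfterClass() throws Exception {
    // Without an explicit shutdown, directories and backup metadata leak
    // into the next test class that runs in the same JVM.
    TEST_UTIL.shutdownMiniCluster();
  }
}
{code}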

> Backup tests are categorically flakey
> -
>
> Key: HBASE-25888
> URL: https://issues.apache.org/jira/browse/HBASE-25888
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore, test
>Reporter: Nick Dimiduk
>Assignee: Mallikarjun
>Priority: Major
> Fix For: 3.0.0-alpha-1
>
> Attachments: 
> TEST-org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestBackupMerge.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestFullBackupSet.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.xml.gz,
>  
> TEST-org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.xml.gz, 
> TEST-org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.xml.gz, 
> org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt, 
> org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt.gz, 
> org.apache.hadoop.hbase.backup.TestBackupMerge-output.txt.gz, 
> org.apache.hadoop.hbase.backup.TestBackupMerge.txt.gz, 
> org.apache.hadoop.hbase.backup.TestFullBackupSet-output.txt.gz, 
> org.apache.hadoop.hbase.backup.TestFullBackupSet.txt.gz, 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures-output.txt.gz,
>  
> org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.txt.gz, 
> org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad-output.txt.gz,
>  org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.txt.gz, 
> org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests-output.txt.gz, 
> org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.txt.gz
>
>
> Here are some logs from a PR build vs. master that suffered a significant
> number of failures in the backup tests. I suspect that a single improvement
> could make all of these tests more robust.
> {noformat}
> Test Name | Duration | Age
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore | 6 min 23 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.TestIncBackupMergeRestore | 5 min 3 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.testFullBackupSetExist | 6 min 16 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.TestIncBackupMergeRestore | 5 min 55 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable | 5 min 56 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreSingleEmpty | 6 min 5 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreMultipleEmpty | 0.17 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.(?)
> {noformat}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work started] (HBASE-25888) Backup tests are categorically flakey

2021-05-18 Thread Mallikarjun (Jira)


 [ 
https://issues.apache.org/jira/browse/HBASE-25888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-25888 started by Mallikarjun.
---
> Backup tests are categorically flakey
> -
>
> Key: HBASE-25888
> URL: https://issues.apache.org/jira/browse/HBASE-25888
> Project: HBase
>  Issue Type: Bug
>  Components: backuprestore, test
>Reporter: Nick Dimiduk
>Assignee: Mallikarjun
>Priority: Major
> Attachments: 
> TEST-org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.xml.gz, 
> org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt, 
> org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.txt.gz
>
>
> Here are some logs from a PR build vs. master that suffered a significant
> number of failures in the backup tests. I suspect that a single improvement
> could make all of these tests more robust.
> {noformat}
> Test Name | Duration | Age
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.testBackupDeleteRestore | 6 min 23 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupDeleteRestore.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.TestIncBackupMergeRestore | 5 min 3 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestBackupMerge.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.testFullBackupSetExist | 6 min 16 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestFullBackupSet.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.TestIncBackupMergeRestore | 5 min 55 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupMergeWithFailures.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.TestIncBackupDeleteTable | 5 min 56 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestIncrementalBackupWithBulkLoad.(?) | 1 min 6 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreSingleEmpty | 6 min 5 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.testFullRestoreMultipleEmpty | 0.17 sec | 1
> precommit checks / yetus jdk8 Hadoop3 checks / org.apache.hadoop.hbase.backup.TestRestoreBoundaryTests.(?)
> {noformat}
> https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3249/4/testReport/



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

