[ https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mallikarjun updated HBASE-25891:
--------------------------------
    Description: 
Currently, WAL file names are stored in the `backup:system` meta table:
{code:java}
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
These entries are primarily used for the following:
 * Checking whether a particular WAL has already been backed up.
 * Checking, during an incremental backup, whether a WAL that is about to be 
backed up was already covered by a previous incremental backup.
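For orientation, the `wals:` row keys in the listing above can be taken apart as in the minimal sketch below. It assumes the row key is the literal prefix `wals:` followed by the WAL file name, and that the file name is the URL-encoded region server name, a dot, and a numeric timestamp. The class and method names are hypothetical and are not part of the HBase backup code.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical helper for reading the `wals:` row keys shown in the listing. */
final class WalRowKey {
    // Assumption: the row key prefix visible in the listing.
    static final String WALS_PREFIX = "wals:";

    /** Strips the `wals:` prefix and returns the WAL file name. */
    static String walFileName(String rowKey) {
        if (!rowKey.startsWith(WALS_PREFIX)) {
            throw new IllegalArgumentException("Not a wals: row key: " + rowKey);
        }
        return rowKey.substring(WALS_PREFIX.length());
    }

    /** Decodes the server part, i.e. everything before the last dot. */
    static String serverName(String walFileName) {
        int dot = walFileName.lastIndexOf('.');
        try {
            return URLDecoder.decode(walFileName.substring(0, dot), "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Parses the numeric suffix after the last dot.
     * Assumption: the name has no extra suffix after the timestamp.
     */
    static long timestamp(String walFileName) {
        int dot = walFileName.lastIndexOf('.');
        return Long.parseLong(walFileName.substring(dot + 1));
    }
}
```

With the first row key from the listing, `serverName` yields `preprod-dn-1,16020,1614844389000` and `timestamp` yields `1621996160175`.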

Proposed changes for the two use cases above:
 * The log roll performed during an incremental or full backup is already 
recorded under the prefix `trslm:`. These entries can be used to identify 
which log files can be cleaned up.
 * The check during an incremental backup for whether a particular WAL has 
already been backed up is redundant. No such check is required.
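The first change can be illustrated with a rough sketch. It assumes that the `trslm:` entries can be read as a map from table name to per-region-server log roll timestamps, and that a WAL is safe to clean up once its timestamp is not newer than the oldest log roll recorded for its server across all backed-up tables. Both the data shape and the comparison are assumptions made for illustration. The names below are hypothetical, and the actual patch may decide differently, in particular for servers with no recorded log roll.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: decide WAL cleanup from recorded log roll timestamps. */
final class WalCleanupSketch {

    /**
     * Collapses table -> (server -> log roll timestamp) into
     * server -> oldest log roll timestamp across all tables.
     */
    static Map<String, Long> oldestRollPerServer(
            Map<String, Map<String, Long>> tableToServerRollTs) {
        Map<String, Long> oldest = new HashMap<>();
        for (Map<String, Long> serverTs : tableToServerRollTs.values()) {
            for (Map.Entry<String, Long> e : serverTs.entrySet()) {
                // Keep the smaller timestamp when a server appears for several tables.
                oldest.merge(e.getKey(), e.getValue(), Long::min);
            }
        }
        return oldest;
    }

    /**
     * A WAL is treated as deletable only if its server has a recorded log roll
     * and the WAL is not newer than that roll.
     * Assumption: WALs of unknown servers are conservatively kept.
     */
    static boolean canDelete(String server, long walTimestamp,
                             Map<String, Long> oldestRoll) {
        Long roll = oldestRoll.get(server);
        return roll != null && walTimestamp <= roll;
    }
}
```

Taking the minimum per server means a WAL is kept until every table's backup has rolled past it, so no per-file `wals:` rows are needed for the decision.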

 

  was:
Currently, WAL file names are stored in the `backup:system` meta table:
{code:java}
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
 

This has several problems:
 # An ever-growing set of rows for WALs sourced for incremental backup is 
maintained and never cleaned up.
 # Listing WALs is unnecessary, both for performing an incremental backup and 
for the log cleaner.
 # There is no support for rsgroup. Hence tables belonging to rsgroups that do 
not have backup enabled also have to retain their WALs, forever.

 

Proposed Solution:

 

 


> Remove dependence storing WAL filenames for backup
> --------------------------------------------------
>
>                 Key: HBASE-25891
>                 URL: https://issues.apache.org/jira/browse/HBASE-25891
>             Project: HBase
>          Issue Type: Improvement
>          Components: backup&restore
>    Affects Versions: 3.0.0-alpha-1
>            Reporter: Mallikarjun
>            Assignee: Mallikarjun
>            Priority: Major
>             Fix For: 3.0.0-alpha-1


