[
https://issues.apache.org/jira/browse/HBASE-25891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mallikarjun updated HBASE-25891:
--------------------------------
Description:
Currently, WAL logs are stored in the `backup:system` meta table:
{code:java}
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
These entries are primarily used for the following:
* WALs are recorded in the meta table to check whether a particular log has
been backed up.
* During an incremental backup, to check whether a WAL that is about to be
backed up was already covered by a previous incremental backup.
Proposed changes for these two use cases:
* The log roll performed during an incremental or full backup is already
recorded under the prefix `trslm:`. Those entries can be used to identify which
log files can be cleaned up.
* The check during an incremental backup for whether a particular WAL has
already been backed up is redundant. No such check is required.
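The first change can be illustrated with a minimal sketch. It is not the HBase implementation: the class `WalCleanupSketch`, the method names, and the map from server name to last log-roll timestamp are all hypothetical stand-ins (the layout of the `trslm:` rows is not shown in this issue). It assumes a WAL file name has the shape `<encoded server name>.<timestamp>`, as in the scan output above, and that a WAL written at or before the last recorded log roll for its server no longer needs to be retained.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: decide whether a WAL is covered by a backup using only
// per-server log-roll timestamps, without a per-WAL row in the meta table.
final class WalCleanupSketch {

    /** Server name and trailing timestamp parsed from a WAL file name. */
    static final class WalInfo {
        final String server;
        final long timestamp;

        WalInfo(String server, long timestamp) {
            this.server = server;
            this.timestamp = timestamp;
        }
    }

    /**
     * Parses names shaped like "host%2Cport%2Cstartcode.timestamp".
     * Assumption: real WAL names may carry extra suffixes that this ignores.
     */
    static WalInfo parse(String walFileName) {
        int dot = walFileName.lastIndexOf('.');
        String server = walFileName.substring(0, dot).replace("%2C", ",");
        long timestamp = Long.parseLong(walFileName.substring(dot + 1));
        return new WalInfo(server, timestamp);
    }

    /**
     * Returns true when the WAL's timestamp is at or before the last log roll
     * recorded for its server. Unknown servers are kept (returns false).
     */
    static boolean isCoveredByBackup(String walFileName, Map<String, Long> lastLogRollByServer) {
        WalInfo wal = parse(walFileName);
        Long lastRoll = lastLogRollByServer.get(wal.server);
        return lastRoll != null && wal.timestamp <= lastRoll;
    }

    public static void main(String[] args) {
        // Stand-in for what would be read from the `trslm:` rows.
        Map<String, Long> lastRoll = new HashMap<>();
        lastRoll.put("preprod-dn-1,16020,1614844389000", 1621999760280L);

        // WAL older than the recorded log roll: covered, may be cleaned up.
        System.out.println(isCoveredByBackup(
            "preprod-dn-1%2C16020%2C1614844389000.1621996160175", lastRoll)); // true
        // WAL newer than the recorded log roll: must be retained.
        System.out.println(isCoveredByBackup(
            "preprod-dn-1%2C16020%2C1614844389000.1622003360000", lastRoll)); // false
    }
}
```

Keeping files for servers with no recorded log roll is a deliberately conservative choice in this sketch; whether the real comparison is inclusive, and how servers are keyed, would depend on the actual patch.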
was:
Currently, WAL logs are stored in the `backup:system` meta table:
{code:java}
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621996160175
wals:preprod-dn-1%2C16020%2C1614844389000.1621996160175 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:backupId, timestamp=1622003479895, value=backup_1622003358258
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:file, timestamp=1622003479895, value=hdfs://store/hbase/oldWALs/preprod-dn-1%2C16020%2C1614844389000.1621999760280
wals:preprod-dn-1%2C16020%2C1614844389000.1621999760280 column=meta:root, timestamp=1622003479895, value=s3a://2021-05-25--21-45-00--full/set1
{code}
This has several problems:
# An ever-growing set of rows for WALs sourced for incremental backup is
maintained and never cleaned up.
# It is unnecessary to list WALs in order to perform an incremental backup or
to run the log cleaner.
# There is no support for rsgroups. Tables belonging to rsgroups that do not
have backup enabled therefore also have to retain their WALs, indefinitely.
Proposed Solution:
> Remove dependence on storing WAL filenames for backup
> -----------------------------------------------------
>
> Key: HBASE-25891
> URL: https://issues.apache.org/jira/browse/HBASE-25891
> Project: HBase
> Issue Type: Improvement
> Components: backup&restore
> Affects Versions: 3.0.0-alpha-1
> Reporter: Mallikarjun
> Assignee: Mallikarjun
> Priority: Major
> Fix For: 3.0.0-alpha-1
>
>