[ https://issues.apache.org/jira/browse/HBASE-28082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17832931#comment-17832931 ]
Bryan Beaudreault commented on HBASE-28082:
-------------------------------------------
I lost track of this, but just ran into the issue in our environment and was
pleased to find it solved. Thanks [~janvanbesien]!
> oldWALs naming can be incompatible with HBase backup
> ----------------------------------------------------
>
> Key: HBASE-28082
> URL: https://issues.apache.org/jira/browse/HBASE-28082
> Project: HBase
> Issue Type: Bug
> Components: backup&restore
> Environment: Encountered on HBase
> a2e7d2015e9f603e46339d0582e29a86843b9324 (branch-2), running in Kubernetes.
> Reporter: Dieter De Paepe
> Assignee: Jan Van Besien
> Priority: Major
> Fix For: 2.6.0, 3.0.0-beta-1
>
>
> I am testing the HBase backup functionality and noticed the following warning
> when running "hbase backup create incremental ...":
>
> {noformat}
> 23/09/13 15:44:10 WARN org.apache.hadoop.hbase.backup.util.BackupUtils: Skip log file (can't parse): hdfs://hdfsns/hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312
> {noformat}
> It appears in my setup, the oldWALs are indeed given names that seem to break
> "ServerName.valueOf(s)" in "BackupUtils#parseHostFromOldLog(Path p)":
>
>
> {noformat}
> user@hadoop-client-769bc9946-xqrt2:/$ hdfs dfs -ls hdfs:///hbase/hbase/oldWALs
> Found 42 items
> -rw-r--r-- 1 hbase hbase 775421 2023-09-13 13:14
> hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694609957984$masterlocalwal$
> -rw-r--r-- 1 hbase hbase 26059 2023-09-13 13:29
> hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694610867894$masterlocalwal$
> ...
> -rw-r--r-- 1 hbase hbase 242479 2023-09-13 14:16
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312
> -rw-r--r-- 1 hbase hbase 4364 2023-09-13 14:16
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694610188654
> ...
> -rw-r--r-- 1 hbase hbase 70802 2023-09-13 13:15
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694609970025.meta
> -rw-r--r-- 1 hbase hbase 93 2023-09-13 13:04
> hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694610188627.meta
> ...
> {noformat}
> I'd say this is not a bug in the backup system, but rather in whatever gives
> the oldWAL files their names. However, I'm not familiar enough with the HBase
> code to find where these files are created. Any pointers are appreciated.
> Given that this causes some logs to be missed during backup, I guess this can
> lead to data loss in a backup restore?
>
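For illustration, the file names quoted above can be parsed without relying on the position of the last dot. The following is only a sketch and not the HBase code or the fix that was committed: the class OldWalNameSketch and its method parseHostPort are hypothetical, and it assumes that every oldWAL file name starts with a URL-encoded "host,port,startcode" server name, as in the listing above.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; it is not part of HBase.
class OldWalNameSketch {

  // Assumption: the decoded file name begins with "host,port,startcode".
  // The host may itself contain dots (e.g. Kubernetes service DNS names),
  // so the pattern anchors on the commas instead of splitting on ".".
  private static final Pattern SERVER_PREFIX =
      Pattern.compile("^([^,]+),(\\d+),(\\d+)");

  // Returns "host:port" for a file name of the assumed shape, or null otherwise.
  // Requires Java 10+ for URLDecoder.decode(String, Charset).
  static String parseHostPort(String fileName) {
    String decoded = URLDecoder.decode(fileName, StandardCharsets.UTF_8);
    Matcher m = SERVER_PREFIX.matcher(decoded);
    return m.find() ? m.group(1) + ":" + m.group(2) : null;
  }

  public static void main(String[] args) {
    // File names taken from the listing in the issue description.
    String regionWal = "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
        + "%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared"
        + ".svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312";
    String metaWal = "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
        + "%2C16020%2C1694609964681.meta.1694609970025.meta";
    System.out.println(parseHostPort(regionWal));
    System.out.println(parseHostPort(metaWal));
  }
}
```

Anchoring on the commas means the repeated server name, the "regiongroup-0" suffix, and the ".meta" and "$masterlocalwal$" endings do not affect the result. Names that do not start with an encoded server name return null and would still need to be skipped or handled by the caller.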