Dieter De Paepe created HBASE-28082:
---------------------------------------
Summary: oldWALs naming can be incompatible with HBase backup
Key: HBASE-28082
URL: https://issues.apache.org/jira/browse/HBASE-28082
Project: HBase
Issue Type: Bug
Environment: Encountered on HBase
a2e7d2015e9f603e46339d0582e29a86843b9324 (branch-2), running in Kubernetes.
Reporter: Dieter De Paepe
I am testing the HBase backup functionality and noticed the following warning when
running "hbase backup create incremental ...":
{noformat}
23/09/13 15:44:10 WARN org.apache.hadoop.hbase.backup.util.BackupUtils: Skip
log file (can't parse):
hdfs://hdfsns/hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312{noformat}
It appears that, in my setup, the oldWALs are given names that seem to break
"ServerName.valueOf(s)" in "BackupUtils#parseHostFromOldLog(Path p)":
{noformat}
user@hadoop-client-769bc9946-xqrt2:/$ hdfs dfs -ls hdfs:///hbase/hbase/oldWALs
Found 42 items
-rw-r--r-- 1 hbase hbase 775421 2023-09-13 13:14
hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694609957984$masterlocalwal$
-rw-r--r-- 1 hbase hbase 26059 2023-09-13 13:29
hdfs:///hbase/hbase/oldWALs/hbase-master-0.minikube-shared%2C16000%2C1694609954719.hbase-master-0.minikube-shared%2C16000%2C1694609954719.regiongroup-0.1694610867894$masterlocalwal$
...
-rw-r--r-- 1 hbase hbase 242479 2023-09-13 14:16
hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694609969312
-rw-r--r-- 1 hbase hbase 4364 2023-09-13 14:16
hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.regiongroup-0.1694610188654
...
-rw-r--r-- 1 hbase hbase 70802 2023-09-13 13:15
hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694609970025.meta
-rw-r--r-- 1 hbase hbase 93 2023-09-13 13:04
hdfs:///hbase/hbase/oldWALs/hbase-region-0.hbase-region.minikube-shared.svc.cluster.local%2C16020%2C1694609964681.meta.1694610188627.meta
...{noformat}
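To illustrate why such a name might be rejected, here is a minimal, self-contained sketch. It does not use the actual HBase classes. It assumes the parser drops the last dot-separated suffix of the file name, URL-decodes the rest, and expects exactly "host,port,startcode". The class and method names (OldWalNameSketch, parseHostPort, hostFromOldWal) and the short "host1" file name are hypothetical and only serve the illustration.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

final class OldWalNameSketch {

    // Simplified stand-in for parsing "host,port,startcode".
    // This is an assumption about the expected format, not the real ServerName code.
    static String parseHostPort(String serverName) {
        int first = serverName.indexOf(',');
        int last = serverName.lastIndexOf(',');
        if (first < 0 || first == last) {
            throw new IllegalArgumentException("expected host,port,startcode: " + serverName);
        }
        String host = serverName.substring(0, first);
        // Everything between the first and last comma must be a plain port number.
        int port = Integer.parseInt(serverName.substring(first + 1, last));
        // Everything after the last comma must be a plain start code.
        Long.parseLong(serverName.substring(last + 1));
        return host + ":" + port;
    }

    // Assumed behaviour: strip the trailing ".<suffix>", URL-decode, then parse.
    // Returns null when the name cannot be parsed (the "skip log file" case).
    static String hostFromOldWal(String fileName) {
        try {
            int idx = fileName.lastIndexOf('.');
            String decoded =
                URLDecoder.decode(fileName.substring(0, idx), StandardCharsets.UTF_8);
            return parseHostPort(decoded);
        } catch (RuntimeException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Hypothetical name with a single server-name prefix.
        String simple = "host1%2C16020%2C1694609964681.1694609969312";
        // Name taken from the listing above: the server name appears twice,
        // followed by a ".regiongroup-0" component.
        String observed =
            "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
            + "%2C16020%2C1694609964681."
            + "hbase-region-0.hbase-region.minikube-shared.svc.cluster.local"
            + "%2C16020%2C1694609964681.regiongroup-0.1694609969312";
        System.out.println(hostFromOldWal(simple));
        System.out.println(hostFromOldWal(observed));
    }
}
```

Under these assumptions, the observed name decodes to a string with four commas, so the text between the first and last comma is not a valid port number and the file is skipped.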
I'd say this is not a bug in the backup system, but rather in whatever gives
the oldWAL files their names. However, I'm not familiar enough with the HBase
code to find where these files are created. Any pointers are appreciated.
Given that this causes some logs to be skipped during backup, I guess this can
lead to data loss when restoring from a backup?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)