[
https://issues.apache.org/jira/browse/HBASE-28998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17901207#comment-17901207
]
Duo Zhang commented on HBASE-28998:
-----------------------------------
[~heliangjun] FYI.
{quote}
But do you know why it is not working with COMPOSITE_CRC ?
{quote}
I guess the problem is that the comparison is done in HBase code, not in
Hadoop, so we do not honor the COMPOSITE_CRC flag...
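If the comparison is done on the HBase side, Hadoop-level options such as {{-Ddfs.checksum.combine.mode=COMPOSITE_CRC}} never come into play. A minimal, self-contained sketch of why such a comparison cannot succeed across filesystem types (the class and values below are hypothetical stand-ins, not HBase's actual code):

```java
import java.util.Arrays;

// Illustrative sketch only, not HBase code: models why a raw checksum
// comparison fails across filesystem types. Hadoop's
// FileSystem.getFileChecksum() returns an algorithm name plus checksum
// bytes; HDFS reports a CRC-based algorithm while S3A exposes no
// comparable checksum, so values can never match even for identical data.
public class ChecksumCompareSketch {

  // Simplified stand-in for org.apache.hadoop.fs.FileChecksum.
  public static final class FileChecksum {
    final String algorithm;
    final byte[] bytes;

    public FileChecksum(String algorithm, byte[] bytes) {
      this.algorithm = algorithm;
      this.bytes = bytes;
    }
  }

  // Naive comparison: checks algorithm and bytes only. A null checksum
  // (as a non-HDFS filesystem may return) or a different algorithm makes
  // this report a "mismatch" even for perfectly good data.
  public static boolean naiveMatch(FileChecksum a, FileChecksum b) {
    return a != null && b != null
        && a.algorithm.equals(b.algorithm)
        && Arrays.equals(a.bytes, b.bytes);
  }

  public static void main(String[] args) {
    FileChecksum hdfs =
        new FileChecksum("MD5-of-0MD5-of-512CRC32C", new byte[] {1, 2, 3});
    FileChecksum s3a = null; // same data, but no comparable checksum from S3A
    System.out.println(naiveMatch(hdfs, s3a)); // false: reported as mismatch
  }
}
```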
> Backup support for S3 broken by checksum added in HBASE-28625
> -------------------------------------------------------------
>
> Key: HBASE-28998
> URL: https://issues.apache.org/jira/browse/HBASE-28998
> Project: HBase
> Issue Type: Bug
> Reporter: Ruben Van Wanzeele
> Priority: Major
>
> With version 2.6.1, backups to S3 fail because of the checksum
> validation introduced with HBASE-28625.
> Stacktrace:
> {code:java}
> Error: java.io.IOException: Checksum mismatch between hdfs://hdfsns/hbase/hbase/data/SYSTEM/CATALOG/b884434cc05aae3a21c0d0723173ce02/0/43b0e3a7b608441eab7dbce2782511bf and s3a://product-eks-v2-brt-master-574-backup/hbase/backup_1732545852227/SYSTEM/CATALOG/archive/data/SYSTEM/CATALOG/b884434cc05aae3a21c0d0723173ce02/0/43b0e3a7b608441eab7dbce2782511bf.
> Input and output filesystems are of different types.
> Their checksum algorithms may be incompatible. You can choose file-level checksum validation via -Ddfs.checksum.combine.mode=COMPOSITE_CRC when block-sizes or filesystems are different.
> Or you can skip checksum-checks altogether with -no-checksum-verify, for the table backup scenario, you should use -i option to skip checksum-checks.
> (NOTE: By skipping checksums, one runs the risk of masking data-corruption during file-transfer.)
> 	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.verifyCopyResult(ExportSnapshot.java:596)
> 	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.copyFile(ExportSnapshot.java:332)
> 	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:254)
> 	at org.apache.hadoop.hbase.snapshot.ExportSnapshot$ExportMapper.map(ExportSnapshot.java:180)
> 	at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> 	at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:800)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:348)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:178)
> 	at java.base/java.security.AccessController.doPrivileged(AccessController.java:714)
> 	at java.base/javax.security.auth.Subject.doAs(Subject.java:525)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1953)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:172)
> 2024-11-25 14:47:42,715 ERROR org.apache.hadoop.hbase.snapshot.ExportSnapshot: Snapshot export failed
> org.apache.hadoop.hbase.snapshot.ExportSnapshotException: Task failed task_1732545054899_0007_m_000001
> Job failed as tasks failed. failedMaps:1 failedReduces:0 killedMaps:0 killedReduces: 0
> 	at org.apache.hadoop.hbase.snapshot.ExportSnapshot.runCopyJob(ExportSnapshot.java:935)
> 	at org.apache.hadoop.hbase.snapshot.ExportSnapshot.doWork(ExportSnapshot.java:1204)
> 	at org.apache.hadoop.hbase.util.AbstractHBaseTool.run(AbstractHBaseTool.java:151)
> 	at org.apache.hadoop.hbase.backup.mapreduce.MapReduceBackupCopyJob.copy(MapReduceBackupCopyJob.java:382)
> 	at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.snapshotCopy(FullTableBackupClient.java:113)
> 	at org.apache.hadoop.hbase.backup.impl.FullTableBackupClient.execute(FullTableBackupClient.java:175)
> 	at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.backupTables(BackupAdminImpl.java:594){code}
> I think the solution is to perform the checksum validation only when the
> input and output filesystems are of the same type.
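The proposed guard could be sketched as below (a hedged illustration, not an actual patch; the class and method names are hypothetical, and filesystem type is approximated by URI scheme):

```java
import java.net.URI;

// Hedged sketch of the fix proposed in this issue: verify checksums only
// when the source and destination filesystems are of the same type,
// approximated here by comparing URI schemes such as "hdfs" and "s3a".
public class ChecksumGuardSketch {

  // Returns true only when both paths live on the same filesystem type,
  // i.e. when their checksums can be expected to be comparable.
  public static boolean shouldVerifyChecksum(URI src, URI dst) {
    String srcScheme = src.getScheme();
    // equalsIgnoreCase returns false for a null argument, so a missing
    // destination scheme also skips verification.
    return srcScheme != null && srcScheme.equalsIgnoreCase(dst.getScheme());
  }

  public static void main(String[] args) {
    URI src = URI.create("hdfs://hdfsns/hbase/data/some/file");
    URI dst = URI.create("s3a://some-bucket/hbase/backup/some/file");
    System.out.println(shouldVerifyChecksum(src, dst)); // false: skip, types differ
    System.out.println(shouldVerifyChecksum(src, src)); // true: same type, verify
  }
}
```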
--
This message was sent by Atlassian Jira
(v8.20.10#820010)