Hi Darin,

I'm afraid it's very difficult to fix the corruption. The only options are to rewrite the whole RocksDB MANIFEST to remove that file, or to rewrite the SST file itself. Either way there will be some data loss. Alternatively, if you have local recovery enabled, you may find a local copy of that checkpoint file, which you can use to replace the corresponding file on DFS. Or, if the corrupted file itself came from the local copy, disabling local recovery may help.
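If you do find a local copy, it may be worth checking first that it is not byte-for-byte identical to the corrupted file on DFS, since an identical copy would carry the same bad block. Below is a minimal sketch of that check. The function name `check_candidate` and both path arguments are made up for illustration, and finding which local file corresponds to the DFS file is not covered here.

```shell
#!/bin/sh
# Sketch only: compare a candidate local-recovery copy with the corrupted
# file downloaded from DFS. Both paths are hypothetical arguments.
#
# Prints "identical", "different" or "missing" and returns 1, 0 or 2.
check_candidate() {
    local_copy=$1
    dfs_copy=$2

    # Both files must exist before they can be compared.
    if [ ! -f "$local_copy" ] || [ ! -f "$dfs_copy" ]; then
        echo "missing"
        return 2
    fi

    # cmp -s is silent and exits 0 only when the files match byte for byte.
    if cmp -s "$local_copy" "$dfs_copy"; then
        echo "identical"   # same bytes, so the same corrupt block
        return 1
    fi

    echo "different"       # worth verifying before using it as a replacement
    return 0
}
```

A copy reported as "different" is only a candidate. It would still need to pass the same `--command=verify --verify_checksum` run shown in your message before it replaces anything on DFS.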
It is rare, and I guess it is caused by some DFS failure or disk corruption. You can keep an eye on that.

Best,
Zakelly

On Wed, Feb 4, 2026 at 12:03 PM Darin Amos via user <[email protected]> wrote:

> Hi!
>
> I have a problem where my incremental checkpoint has a corrupt SST file
> that was created weeks ago, meaning going back in time to replay the data
> to fix the corruption is not possible, and re-bootstrapping the job is
> extremely difficult.
>
> Is there a way to patch the corrupt SST file to fix my job? In this
> particular case some data loss is acceptable in favour of system health.
>
> Thanks!
>
> Darin
>
> > % $(brew --prefix rocksdb)/bin/rocksdb_sst_dump \
> >   --file=./checkpoint_verification/sst_files/06240ecd-9154-409b-8a32-3a0ebd8e64de.sst \
> >   --command=verify --verify_checksum
> > options.env is 0x600003f638e0
> > Process ./checkpoint_verification/sst_files/06240ecd-9154-409b-8a32-3a0ebd8e64de.sst
> > Sst file format: block-based
> > ./checkpoint_verification/sst_files/06240ecd-9154-409b-8a32-3a0ebd8e64de.sst
> > is corrupted: Corruption: block checksum mismatch: stored = 3954219857,
> > computed = 4054404265, type = 1 in
> > ./checkpoint_verification/sst_files/06240ecd-9154-409b-8a32-3a0ebd8e64de.sst
> > offset 84885876 size 11204
