Triage recovery of damaged Subversion repo

Michael K Fri, 28 Oct 2022 15:25:40 -0700

Hi, I could really use some direction and advice. I came across some past
messages that made me think some of you might be able to help with this.
I'd appreciate any input!


I am working on an important Subversion repository that was hit by a
targeted ransomware attack. The backups were apparently securely deleted as
well, though one backup from a few years back, kept in different storage,
was unaffected. In brief, the ransomware encrypted and overwrote (up to)
the first 4 KB of each file and also appended some encrypted data and
zero-padding to the end of every file. Since Subversion has many small
files, the data has been slashed up badly and some is gone forever. But
files larger than 4 KB still have original data remaining.
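A rough sketch of how the possibly intact byte range of one damaged file
could be located, assuming exactly the damage pattern described above. The
function name and the 4096-byte default are illustrative only. The length
of the appended encrypted blob is unknown, so this only strips the trailing
zero-padding, and it would also strip any zero bytes that legitimately
ended the original file.

```python
def surviving_span(data, damaged_prefix=4096):
    """Return (start, end) of bytes that may still be original, or None.

    Assumption: the first `damaged_prefix` bytes are overwritten and the
    file ends in zero-padding. The appended encrypted blob is NOT removed,
    because its length is not known.
    """
    end = len(data.rstrip(b"\x00"))  # drop trailing zero-padding only
    if end <= damaged_prefix:
        return None                  # nothing past the overwritten prefix
    return damaged_prefix, end
```

A file for which this returns None has no recoverable content by this
measure and would have to come from the old backup, if at all.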

My goal is to build a working repository with as much of the remaining
original data as I can, like a triage operation. I have a backup that was
not affected, but it does not contain the last few years of data, so I
need to use the data that was affected by the ransomware encryption too.

Eventually I plan to write a program that will process all the affected
revs and revprops files and output new files. I'm coming at this without
previous knowledge of the inner workings of Subversion, but I am
comfortable working in a hex editor and writing programs that process raw
data. So for now, I have been learning about Subversion by reading the
documentation and by working hands-on with the raw data of these files in a
hex editor. I've learned a bit about the "representations" within the revs
files. That will probably be helpful, since they are the units that each
revs file can be broken down into. I can use that knowledge to try to keep
full "representations" and discard partial ones.
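A heuristic sketch of that idea: scan the bytes past the overwritten prefix
for spans that look like complete representations. It assumes that a
representation starts with a "PLAIN" or "DELTA" header line (a DELTA header
may carry three numbers) and ends with an "ENDREP" line. The function name
is made up for illustration, and binary delta data can contain these byte
sequences by chance, so every hit would need to be checked by hand.

```python
import re

# Assumed header forms: "PLAIN\n", "DELTA\n", or "DELTA <n> <n> <n>\n".
HEADER = re.compile(rb"(?:PLAIN|DELTA(?: \d+ \d+ \d+)?)\n")
TRAILER = b"ENDREP\n"

def find_complete_reps(data, damaged_prefix=4096):
    """Return (start, end) spans that look like whole representations."""
    spans = []
    pos = damaged_prefix             # skip the overwritten region
    while True:
        m = HEADER.search(data, pos)
        if m is None:
            break                    # no further header candidates
        end = data.find(TRAILER, m.end())
        if end == -1:
            break                    # header without trailer: partial, drop
        end += len(TRAILER)
        spans.append((m.start(), end))
        pos = end                    # continue after this representation
    return spans
```

A representation whose header fell inside the first 4 KB is skipped
entirely, which matches the "discard partial ones" intent.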

Currently, I am trying to add a single new empty revision that Subversion
will accept when tested with the "svnadmin verify" and "svn info" commands.
I have fabricated data for a revprops file for this new revision and
adjusted the "current" file to the new revision number, and I'm now working
on the revs file. If I can achieve that, I'll move on to adding a new
revision that contains some original data.
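For the fabricated revprops file, a minimal sketch of a serializer. It
assumes the unpacked revprops file is a plain property list of
"K <length>", name, "V <length>", value lines terminated by "END", with
lengths counted in bytes. The function name is hypothetical, and this does
not cover packed revprops.

```python
def serialize_props(props):
    """Serialize a dict of str -> str into the assumed K/V/END layout."""
    out = bytearray()
    for name, value in props.items():
        k = name.encode("utf-8")
        v = value.encode("utf-8")
        # Lengths are byte counts of the encoded name and value.
        out += b"K %d\n%s\nV %d\n%s\n" % (len(k), k, len(v), v)
    out += b"END\n"
    return bytes(out)
```

For example, serialize_props({"svn:log": "empty"}) yields the bytes
"K 7\nsvn:log\nV 5\nempty\nEND\n".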

I've learned about the footer of the revs files from the errors I've run
into when trying those commands. I know how the L2P_OFFSET and P2L_OFFSET
work, and I have fixed the errors that appear when those offsets are
incorrect. I also discovered some kind of item index from logical
addressing (I think, not sure what it is called) which occurs right after
both "L2P-INDEX" and "P2L-INDEX" in the revs files. By looking at many
files, I figured out how to calculate the binary representation of that
value from the rev number (strange calculation). That got me past this
error from the svn info command: "svn: E160054: Index rev / pack file
revision numbers do not match".
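One guess, not confirmed anywhere in this message, is that the "strange
calculation" is a variable-length integer encoding: 7 payload bits per
byte, least significant group first, with the high bit set on every byte
except the last. A sketch under that assumption (function names are
illustrative) that can be checked against the bytes seen in the hex editor:

```python
def encode_7b8b(value):
    """Encode a non-negative int, 7 bits per byte, low group first."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)  # high bit: more bytes follow
        value >>= 7
    out.append(value)                      # final byte, high bit clear
    return bytes(out)

def decode_7b8b(data, pos=0):
    """Decode one value at `pos`; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if byte < 0x80:                    # high bit clear: last byte
            return value, pos
        shift += 7
```

If encode_7b8b(rev) does not reproduce the bytes that follow the index
marker for a known-good revision, the assumption is wrong for this field.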

Now I'm trying to get past the "L2P index checksum mismatch" error. I don't
yet know how the "actual" checksum value is calculated. Thankfully,
Subversion's error message shows both the "expected" and "actual"
checksums. So I've tried taking MD5 hashes of byte ranges of the L2P-INDEX
area (and variations), but haven't gotten a match to that "actual" value
yet.
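A sketch of one hypothesis to test, with every layout detail an
assumption: the last byte of the revs file gives the footer length, the
footer is the ASCII text "<L2P offset> <L2P MD5> <P2L offset> <P2L MD5>",
the L2P checksum covers the bytes from the L2P offset up to the P2L
offset, and the P2L checksum covers the bytes from the P2L offset up to the
footer. The function names are made up, and the data appended by the
ransomware must be stripped first so that the footer is really at the end.

```python
import hashlib

def parse_footer(data):
    """Split the assumed footer into offsets and stored checksums."""
    footer_len = data[-1]                    # assumed: final byte = length
    footer_start = len(data) - 1 - footer_len
    fields = data[footer_start:-1].decode("ascii").split(" ")
    l2p_offset, l2p_md5, p2l_offset, p2l_md5 = fields
    return int(l2p_offset), l2p_md5, int(p2l_offset), p2l_md5, footer_start

def index_md5s(data):
    """Return {name: (stored, computed)} for both index areas."""
    l2p_off, l2p_md5, p2l_off, p2l_md5, footer_start = parse_footer(data)
    return {
        "l2p": (l2p_md5, hashlib.md5(data[l2p_off:p2l_off]).hexdigest()),
        "p2l": (p2l_md5, hashlib.md5(data[p2l_off:footer_start]).hexdigest()),
    }
```

Running this on an undamaged revs file from the old backup would show
whether stored and computed values agree before trying it on rebuilt files.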

If you could provide insight into where these two checksums come from, I'd
be really grateful. Also, any other general thoughts on this project would
be appreciated.

Michael