Hi, I could really use some direction and advice. I came across some past messages that made me think some of you might be able to help with this. I'd appreciate any input!
I am working on an important Subversion repository that was hit by a targeted ransomware attack. Apparently the backups were securely deleted as well, except for one from a few years back that was kept in different storage and was unaffected. That backup does not contain the last few years of data, so I need to use the data that was affected by the ransomware encryption.

In brief, the ransomware encrypted and overwrote (up to) the first 4 KB of every file and also appended some encrypted data and zero-padding to the end of every file. Since Subversion has many small files, the data has been badly cut up and some of it is gone forever. But files larger than 4 KB still contain original data. My goal is to build a working repository with as much of the remaining original data as I can, like a triage operation. Eventually I plan to write a program that will process all the affected revs and revprops files and output new files.

I'm coming at this without previous knowledge of the inner workings of Subversion, but I am comfortable working in a hex editor and writing programs that process raw data. So for now, I have been learning about Subversion by reading the documentation and by working hands-on with the raw data of these files in a hex editor. I've learned a bit about the "representations" within the revs files. That will probably be helpful, since they provide units that each revs file can be broken down into. I can use that knowledge to keep complete "representations" and discard partial ones.

Currently, I am trying to add a single new empty revision that Subversion will accept when tested with the "svnadmin verify" and "svn info" commands. I fabricated data for the revprops file of this new revision, adjusted the "current" file to the new revision number, and am now working on the revs file. If I can achieve that, I'll move on to adding a new revision that contains some original data.
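The idea of keeping complete "representations" and discarding partial ones could be sketched roughly as follows. This is only a sketch under stated assumptions: that each representation starts with a header line (`PLAIN`, `DELTA`, or `DELTA` followed by three numbers) and ends with an `ENDREP` line, as described in the FSFS structure notes, and that nothing beyond the first 4 KB was overwritten. The names `find_complete_reps` and `DAMAGED_PREFIX` are made up for illustration.

```python
import re

# Assumption: the attack overwrote at most the first 4 KB of each file.
DAMAGED_PREFIX = 4096

# Representation header lines: "PLAIN", "DELTA", or
# "DELTA <number> <number> <number>", each terminated by a newline.
REP_HEADER = re.compile(rb"(?:PLAIN|DELTA(?: \d+ \d+ \d+)?)\n")
REP_END = b"ENDREP\n"


def find_complete_reps(data, damaged_prefix=DAMAGED_PREFIX):
    """Return (start, end) byte ranges of candidate representations
    (header line through ENDREP line) that begin after the damaged prefix."""
    spans = []
    pos = damaged_prefix
    while True:
        m = REP_HEADER.search(data, pos)
        if m is None:
            break
        start = m.start()
        # Accept a header only at the start of a line, or right at the
        # boundary of the damaged area where the previous byte is lost.
        at_line_start = start == damaged_prefix or data[start - 1:start] == b"\n"
        if not at_line_start:
            pos = start + 1
            continue
        end = data.find(REP_END, m.end())
        if end == -1:
            break  # header without a surviving ENDREP: treat as partial
        end += len(REP_END)
        spans.append((start, end))
        pos = end
    return spans
```

The spans are candidates only. Compressed or binary content can contain these byte sequences by chance, and the appended ransomware trailer is not handled here, so each span would still need checking against the rest of the revision's metadata.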
I've learned about the footer of the revs files as I've come across errors when running those commands. I know how the L2P_OFFSET and P2L_OFFSET work, and I have fixed the errors that occur when those offsets are incorrect. I also discovered some kind of item index from logical addressing (I think; I'm not sure what it is called), which occurs right after both "L2P_OFFSET" and "P2L_OFFSET" in the revs files. By looking at many files, I figured out how to calculate its binary representation from the rev number (a strange calculation). That got me past the error "svn: E160054: Index rev / pack file revision numbers do not match" from the svn info command.

Now I'm trying to get past the "L2P index checksum mismatch" error. I don't know yet how the "actual" checksum value is calculated. Thankfully, Subversion's error message shows both the "expected" and "actual" checksums. So I've tried taking an MD5 hash over byte ranges of the L2P-INDEX area (and variations), but haven't gotten a match to that "actual" value yet.

If you could provide insight into where these 2 checksums come from, I'd be really grateful. Also, any other general thoughts on this project would be appreciated.

Michael
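P.S. In case it helps to see the approach, here is a minimal Python sketch of the kind of byte-range search described above. It assumes the checksum is a plain MD5 over one contiguous byte range of the revs file, which is not confirmed. The function name `find_md5_range` and its parameters are made up for illustration.

```python
import hashlib


def find_md5_range(data, target_hex, starts, ends):
    """Try every (start, end) pair and return the pairs for which the MD5
    of data[start:end] equals target_hex (compared case-insensitively)."""
    target = target_hex.lower()
    matches = []
    for start in starts:
        for end in ends:
            if end <= start:
                continue  # skip empty or reversed ranges
            if hashlib.md5(data[start:end]).hexdigest() == target:
                matches.append((start, end))
    return matches
```

The candidate `starts` and `ends` would be small windows around the offsets taken from the footer, and `target_hex` would be the "actual" value from the error message. An empty result would suggest that the checksum covers a different range or is not a plain MD5 of the raw bytes.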