On Thu, Apr 16, 2026 at 9:58 AM Daniel Sahlberg
<[email protected]> wrote:
>
> Den tors 16 apr. 2026 kl 09:23 skrev Matthew Turner <[email protected]>:
>>
>> Hello:
>>
>> I recently came across https://issues.apache.org/jira/browse/SVN-516 and am 
>> wondering if there is still interest in this due to the last commit being in 
>> 2016.
>>
>> My interest in this stems from wanting to archive older commits that are 
>> unlikely to be accessed to an external disk or other medium, and then free 
>> up the space on the server. That way it still adheres to the principle of an 
>> unmodified history, just with less immediate accessibility. Note that I want 
>> to preserve the UUID and revision numbers so a dump/load cycle is not as 
>> favourable as this feature would be.
>>
>> I have a rough idea of how this may be done but I have never touched SVN 
>> source before and struggle to figure out what APIs would be relevant to the 
>> FSFS structure I found 
>> (https://svn.apache.org/repos/asf/subversion/trunk/subversion/libsvn_fs_fs/structure)
>>  and am not sure I have room in my life to implement it at this time. My 
>> idea is, when packing a shard to also squash (similar to GIT squash 
>> https://git-scm.com/docs/git-merge#Documentation/git-merge.txt---squash) the 
>> commits together for a smaller delta. Remember that while this history is 
>> "lost" in the squash, it is supposed to have been archived elsewhere that 
>> enables it to be restored. The ability for shards to live on a different 
>> mount and soft-linked seems like a nice workaround for disk space, but that 
>> requires that everything is always mounted.
>>
>> Is there still interest? Is my implementation idea worth effort (in the 
>> future)?
>>
>> ~Matthew J.Turner
>
>
> From my point of view, this is a very desirable feature that has stalled not 
> because of non-interest but because of a lack of developer time. If you are 
> able to help out - I think that would be great!
>
> That said, there are some issues to consider as you can see in the Jira 
> issue. Most importantly, it would probably invalidate any existing WCs.
>
> You say that you want to keep existing revision numbers and thus a dump/load 
> cycle doesn't help. I just tried the following which seemed to work for me, 
> except that it broke my WC so I had to check out a new one:
>
> C:\temp>svnadmin create repo2
> .. Create a new repo. My existing one is repo1
> C:\temp>svnadmin dump --include NonExistingPath --revision 1:3 repo1 | 
> svnadmin load repo2
> ... Dump the existing repo of the revisions I want to exclude, filtering out 
> everything so I basically get three empty revisions. Load them to the new repo
> C:\temp>svnadmin dump --revision 4:HEAD repo1 | svnadmin load repo2
> ... Load the rest of the revisions.
>
> I end up with a repo with the same revision numbers as in the original one, 
> but with everything looking as if it was created in revision 4.
>
> I don't really see a big win doing the above. Possibly if I had a repository 
> with a lot of binary files in revisions 1:3, then repo2 would be a lot 
> smaller only storing the state of the repository as in revision 4, however I 
> assume you would still keep repo1 around negating any storage savings. One of 
> Subversion's strong points, in comparison to Git, is that we don't even have 
> to care about these old revisions client-side. If I checkout the project @ 
> revision 4, my WC will only get the files as they were at that revision.
>
> Kind regards,
> Daniel
>

Agreed that this is still, after all these years / decades, considered
a very desirable feature. So I would certainly applaud anyone
investing time and effort into this. Apart from lack of developer
time, I'd add that another reason for the lack of progress in this
area is that it is really difficult. Difficult for several reasons,
among others:

- Diverse expectations of what "obliterate" looks like / what use
cases it is for (removing leaked secrets, recovering disk space,
rolling back the most recent revision(s), more flexible "history
manipulation", ...). It might be a good idea to focus on a particular
use case (or set of related use cases), but it doesn't hurt to think
about the bigger picture or a bigger "framework" either.

- The assumption that "history is immutable" is quite fundamental in
the "Subversion system" (server, backend storage (FSFS), wire
protocol(s), client, wc-storage). Obliterate without creating a new
repo with new UUID is a form of history rewrite. Having working copies
gracefully handle some forms of uuid-preserving-obliterate would
certainly be nice, but right now the client / wc can't even detect it
(it will just break in all sorts of ways).

I'd say that any effort for obliterate should start with discussing
specs, desired features, and rough ideas of a design / how to fit it
into the existing system. These discussions may take some time, as
there is a lot to unravel and consider (years of prior discussions
etc), so if you want to go for it, please be patient. And then of
course, we'd need someone(s) to actually implement it :-).

Some links to the most recent discussions on this list:

- March 2018: https://lists.apache.org/thread/7ogp6zqkt18svd6q2w9w6572vlmoo63k
(Script to obliterate the most recent revision(s))
Contains a proposal by Julian Foad for a script to obliterate the most
recent revision(s).
As I said in this thread (referring to a hackathon discussion in
2017): 'it would be wonderful if a client, acting on a working copy,
could detect that "history has been changed". Even if only to give a
fatal error message "your working copy is broken, please check out a
new one".' and 'What would be vastly better, I think, would be that
only "working copies containing traces of the changed
history" would be broken.'
Brane also mentioned the term "data generation ID" as an extra number
to track "history-changing operations".

- March 2019: https://lists.apache.org/thread/05gtv54g70rdxdkpl1j6t9r38ry1xb7t
(svn obliterate - more feasible these days?)
Goes a bit into the FSFS part, but also reiterates and summarizes part
of the previous discussion about "impact on existing working copies".
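
To make the "data generation ID" idea from the 2018 thread a bit more
concrete, here is a minimal sketch in Python. It is purely
illustrative: the class and function names (Repository, WorkingCopy,
checkout, check_working_copy) are hypothetical and do not correspond
to any existing Subversion API or on-disk format. It assumes the
server keeps a counter that is bumped by every history-changing
operation, and that the working copy records the value it saw at
checkout.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    """Hypothetical server-side state; not a real Subversion structure."""
    uuid: str
    generation: int = 0  # assumed counter of history-changing operations

    def obliterate(self) -> None:
        # A real obliterate would rewrite revision data. This sketch only
        # records that history changed while the UUID stays the same.
        self.generation += 1


@dataclass
class WorkingCopy:
    """Hypothetical client-side metadata recorded at checkout time."""
    uuid: str
    generation: int


def checkout(repo: Repository) -> WorkingCopy:
    # Remember which "generation" of history this working copy is based on.
    return WorkingCopy(uuid=repo.uuid, generation=repo.generation)


def check_working_copy(wc: WorkingCopy, repo: Repository) -> str:
    # Compare the recorded identity with what the server reports now.
    if wc.uuid != repo.uuid:
        return "different repository"
    if wc.generation != repo.generation:
        return "history changed; check out a new working copy"
    return "ok"


if __name__ == "__main__":
    repo = Repository(uuid="hypothetical-uuid")
    wc = checkout(repo)
    print(check_working_copy(wc, repo))
    repo.obliterate()
    print(check_working_copy(wc, repo))
```

Note that this only covers the coarse case (every working copy is
flagged after any history change). The finer-grained wish, that only
working copies containing traces of the changed history break, would
need more information than a single counter.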


Cheers,
-- 
Johan
