Re: File versioning based on shallow Git repositories?
Hello Johannes,

Johannes Schindelin writes:
> On Fri, 13 Apr 2018, Jakub Narebski wrote:
>> Hallvard Breien Furuseth writes:
>>
>>> Also maybe it'll be worthwhile to generate .git/info/grafts in a local
>>> clone of the repo to get back easily visible history.  No grafts in
>>> the original repo, grafts mess things up.
>>
>> Just a reminder: modern Git has "git replace", a modern and safe
>> alternative to the grafts file.
>
> Right!
>
> Maybe it is time to start deprecating grafts? They *do* cause problems,
> such as weird "missing objects" problems when trying to fetch into, or
> push from, a repository with grafts. These problems are not shared by the
> `git replace` method.

Also you can propagate "git replace" info with clone / fetch / push.

> I just sent out a patch to add a deprecation warning.

Thank you for this.

-- 
Jakub Narębski
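
As an illustration of the propagation point above: replacement objects live under refs/replace/, and as far as I know the default refspecs do not cover that namespace, so it has to be named explicitly. A minimal sketch; the function names and arguments are made up for this example:

```shell
#!/bin/sh
# Hypothetical helpers, not part of Git. REPO is a local repository path,
# REMOTE is a remote name or URL.

# publish_replacements REPO REMOTE
# Pushes every ref under refs/replace/ to the remote.
publish_replacements() {
    git -C "$1" push -q "$2" 'refs/replace/*:refs/replace/*'
}

# fetch_replacements REPO REMOTE
# Fetches every ref under refs/replace/ from the remote.
fetch_replacements() {
    git -C "$1" fetch -q "$2" 'refs/replace/*:refs/replace/*'
}
```

Adding the same refspec to remote.<name>.fetch in the clone's config should make later fetches pick the replacements up automatically.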
Re: File versioning based on shallow Git repositories?
Hi Kuba,

On Fri, 13 Apr 2018, Jakub Narebski wrote:
> Hallvard Breien Furuseth writes:
>
> > Also maybe it'll be worthwhile to generate .git/info/grafts in a local
> > clone of the repo to get back easily visible history.  No grafts in
> > the original repo, grafts mess things up.
>
> Just a reminder: modern Git has "git replace", a modern and safe
> alternative to the grafts file.

Right!

Maybe it is time to start deprecating grafts? They *do* cause problems,
such as weird "missing objects" problems when trying to fetch into, or
push from, a repository with grafts. These problems are not shared by the
`git replace` method.

I just sent out a patch to add a deprecation warning.

Ciao,
Dscho
Re: File versioning based on shallow Git repositories?
Hallvard Breien Furuseth writes:

> Also maybe it'll be worthwhile to generate .git/info/grafts in a local
> clone of the repo to get back easily visible history.  No grafts in
> the original repo, grafts mess things up.

Just a reminder: modern Git has "git replace", a modern and safe
alternative to the grafts file.

Best,
-- 
Jakub Narębski
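
For illustration, here is roughly how "git replace --graft" could stand in for a grafts file in a local clone of such a backup repository. This is only a sketch under assumptions: the function name is made up, the backups are assumed to be lightweight tags, and the tag names are assumed to sort chronologically (e.g. timestamps).

```shell
#!/bin/sh
# Hypothetical helper, not part of Git.

# chain_backups REPO
# Makes each tagged backup commit appear to have the previous tagged
# backup as its parent, using replacement objects under refs/replace/.
chain_backups() {
    repo=$1
    prev=
    # Assumes tag names sort oldest-first and contain no whitespace.
    for tag in $(git -C "$repo" tag --sort=refname); do
        if [ -n "$prev" ]; then
            git -C "$repo" replace --graft "$tag^{commit}" "$prev^{commit}" || return 1
        fi
        prev=$tag
    done
}
```

The original commits are left untouched; "git --no-replace-objects log" shows the unlinked view again, and "git replace -d" removes a replacement.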
Re: File versioning based on shallow Git repositories?
On 12. april 2018 23:07, Rafael Ascensao wrote:
> Would initiating a repo with an empty root commit, tag it with 'base'
> then use $ git rebase --onto base master@{30 days ago} master;
> be viable?

No... my question was confused from the beginning.  With such large
files I _shouldn't_ have history (or grafts), otherwise Git spends a
lot of CPU time creating diffs when I look at a commit, or worse, when
I try git log.  Which I discovered quickly when trying real data
instead of test-data :-)

Ævar's suggestion was exactly right in that respect.  Thanks again!

-- 
Hallvard
Re: File versioning based on shallow Git repositories?
Would initiating a repo with an empty root commit, tag it with 'base'
then use $ git rebase --onto base master@{30 days ago} master;
be viable?

The --orphan & tag is perhaps more robust, since it's "harder" to move
tags around.

-- 
Rafael Ascensão
Re: File versioning based on shallow Git repositories?
On Thu, Apr 12 2018, Hallvard Breien Furuseth wrote:

> On 12. april 2018 20:47, Ævar Arnfjörð Bjarmason wrote:
>> 1. Create a backup.git repo
>> 2. Each time you make a backup, checkout a new orphan branch, see "git
>>    checkout --orphan"
>> 3. You copy the files over, commit them, "git log" at this point shows
>>    one commit no matter if you've done this before.
>> 4. You create a tag for this backup, e.g. one named after the current
>>    time, delete the branch.
>> 5. You then have a retention period for the tags, e.g. only keep the
>>    last 30 tags if you do daily backups for 30 days of backups.
>>
>> Then as soon as you delete the tags the old commit will be unreferenced,
>> and you can make git-gc delete the data.
>
> Nice!
> Why the tags though, instead of branches named after the current time?

Because tags are idiomatic in git for a reference that doesn't change,
but sure, if you'd like branches that'll work too.

> One --orphan branch/tag per day with several commits would work for me.
>
> Also maybe it'll be worthwhile to generate .git/info/grafts in a local
> clone of the repo to get back easily visible history.  No grafts in
> the original repo, grafts mess things up.

Maybe, I have not tried this with grafts.
Re: File versioning based on shallow Git repositories?
On 12. april 2018 20:47, Ævar Arnfjörð Bjarmason wrote:
> 1. Create a backup.git repo
> 2. Each time you make a backup, checkout a new orphan branch, see "git
>    checkout --orphan"
> 3. You copy the files over, commit them, "git log" at this point shows
>    one commit no matter if you've done this before.
> 4. You create a tag for this backup, e.g. one named after the current
>    time, delete the branch.
> 5. You then have a retention period for the tags, e.g. only keep the
>    last 30 tags if you do daily backups for 30 days of backups.
>
> Then as soon as you delete the tags the old commit will be unreferenced,
> and you can make git-gc delete the data.

Nice!
Why the tags though, instead of branches named after the current time?

One --orphan branch/tag per day with several commits would work for me.

Also maybe it'll be worthwhile to generate .git/info/grafts in a local
clone of the repo to get back easily visible history.  No grafts in
the original repo, grafts mess things up.

-- 
Hallvard
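
The retention step (5) and the final cleanup could look roughly like the sketch below. It is only an illustration: the function name and its arguments are invented, and it assumes tag names sort chronologically (e.g. timestamps). Expiring the reflogs first is my addition, on the understanding that reflog entries can otherwise keep old commits reachable.

```shell
#!/bin/sh
# Hypothetical helper, not part of Git.

# prune_backups REPO KEEP
# Deletes all but the KEEP newest backup tags, then removes the objects
# that are no longer referenced.
prune_backups() {
    repo=$1
    keep=$2
    # List tags newest-first by name, skip the first KEEP, delete the rest.
    git -C "$repo" tag --sort=-refname |
        tail -n +"$((keep + 1))" |
        while read -r tag; do
            git -C "$repo" tag -d "$tag" >/dev/null
        done
    # Drop reflog entries so they cannot keep the old commits alive.
    git -C "$repo" reflog expire --expire=now --all
    # Repack and delete unreachable objects immediately.
    git -C "$repo" gc --quiet --prune=now
}
```

A commit that is still reachable from HEAD or from a branch is kept, so the temporary branch has to be gone and HEAD must not point at an expired backup. Whether --aggressive is also needed, as asked in the original question, is not settled by this sketch.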
Re: File versioning based on shallow Git repositories?
On Thu, Apr 12 2018, Hallvard Breien Furuseth wrote:

> Can I use a shallow Git repo for file versioning, and regularly purge
> history older than e.g. 2 weeks?  Purged data MUST NOT be recoverable.
>
> Or is there a backup tool based on shallow Git cloning which does this?
> Push/pull to another shallow repo would be nice but is not required.
> The files are text files up to 1/4 Gb, usually with few changes.
>
>
> If using Git - I see "git fetch --depth" can shorten history now.
> How do I do that without 'fetch', in the origin repo?
> Also Documentation/technical/shallow.txt describes some caveats, I'm
> not sure how relevant they are.
>
> To purge old data -
>   git config core.logallrefupdates false
>   git gc --prune=now --aggressive
> Anything else?
>
> I'm guessing that without --aggressive, some expired info might be
> deduced from studying the packing of the remaining objects.  Don't
> know if we'll be required to be that paranoid.

The shallow feature is not for this use-case, but there's a much easier
solution that I've used for exactly this use-case, e.g. taking backups
of SQL dumps that delta-compress well, and then throwing out old
backups.

You:

1. Create a backup.git repo
2. Each time you make a backup, checkout a new orphan branch, see "git
   checkout --orphan"
3. You copy the files over, commit them, "git log" at this point shows
   one commit no matter if you've done this before.
4. You create a tag for this backup, e.g. one named after the current
   time, delete the branch.
5. You then have a retention period for the tags, e.g. only keep the
   last 30 tags if you do daily backups for 30 days of backups.

Then as soon as you delete the tags the old commit will be unreferenced,
and you can make git-gc delete the data.

You'll still be able to `git diff` between tags, even though they have
unrelated histories, and the files will still delta-compress.