On Mon, 22 Apr 2024, Mark Wielaard wrote:

> > A system that uses git as the source of
> > truth for all the pull request data and has refs through which all this
> > can be located (with reasonably straightforward, documented formats for
> > the data, not too closely tied to any particular implementation of a
> > pull-request system), so that a single clone --mirror has all the data,
> > might be suitable (people have worked on ensuring git scales well with
> > very large numbers of refs, which you'd probably get in such a system
> > storing all the data in git);
>
> Yes, git is pretty nice for storing lots of variants of somewhat
> identical sources/texts. But this also seems to imply that when we
> offer a system to store "contributor" git trees/forks of projects to
> easily create "pull requests" then we can never remove such users/forks
> and must disallow rebasing any trees that have been "submitted".

For example, GitHub has some version of the source branch for a pull
request under refs/pull/ in the target repository - that doesn't rely on
the source branch or repository staying around.  However, that's only one
version - it doesn't work so well when the source branch is rebased
(though GitHub itself is reported to keep all forks of a repository in a
single repository internally, rarely garbage collected, so the previous
versions probably remain there, just not accessible from any ref).  But
you could certainly have a convention for ref naming that ensures all
versions of a PR are available even when it's rebased.

Things like the "git evolve" proposal <https://lwn.net/Articles/914041/>
could also be relevant (maybe that particular proposal wasn't intended for
the goal of ensuring all submitted versions of a change remain permanently
available, but at least it's dealing with a similar problem - and the more
you have a standard way of representing this kind of information in git,
rather than something very specific to a particular system built on top of
git, the better).

-- 
Joseph S. Myers
josmy...@redhat.com
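
As a rough illustration of the kind of ref-naming convention mentioned
above, where every submitted version of a PR keeps its own ref and so
survives a rebase, here is a minimal Python sketch.  The layout
refs/pull/<number>/v<version> and the helper next_version_ref are
hypothetical, made up for this example; they are not GitHub's scheme or
part of any existing tool.

```python
import re

# Hypothetical layout (an assumption for illustration only):
#   refs/pull/<number>/v<version>   one ref per submitted version, never moved
VERSION_REF = re.compile(r"^refs/pull/(\d+)/v(\d+)$")


def next_version_ref(existing_refs, pr_number):
    """Return the ref name for the next submitted version of a pull request.

    existing_refs is an iterable of full ref names already present in the
    repository; refs that do not follow the layout above are ignored.
    """
    versions = [
        int(m.group(2))
        for m in map(VERSION_REF.match, existing_refs)
        if m and int(m.group(1)) == pr_number
    ]
    # First submission becomes v1; each later submission (for example
    # after a rebase) gets the next number instead of overwriting a ref.
    return f"refs/pull/{pr_number}/v{max(versions, default=0) + 1}"


if __name__ == "__main__":
    refs = ["refs/heads/master", "refs/pull/7/v1", "refs/pull/7/v2"]
    print(next_version_ref(refs, 7))
```

Because no existing ref is ever overwritten under such a scheme, a
clone --mirror would pick up every version, and the original fork or
source branch could be deleted without losing any of them.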