On 02/09/2018 11:30 PM, Jeff King wrote: > On Fri, Feb 09, 2018 at 11:04:17PM +0100, Ævar Arnfjörð Bjarmason wrote: >> One thing that's not discussed yet, and I know just enough about for it >> to tingle my spidey sense, but not enough to say for sure (CC'd Jeff & >> Brandon who know more) is that this feature once shipped might cause >> higher load on git hosting providers. >> >> This is because people will inevitably use it in popular projects for >> some custom filtering, and because you're continually re-fetching and >> inspecting stuff what used to be a really cheap no-op "pull" most of the >> time is a more expensive negotiation every time before the client >> rejects the refs again, and worse for hosting providers because you have >> bespoke ref fetching strategies you have less odds of being able to >> cache both the negotiation and the pack you serve. > > Most of the discussion so far seems to be about "accept this ref or > don't accept this ref", which seems OK. But if you are going to do > custom tweaking like rewriting objects, or making it common to refuse > some refs, then I think things get pretty inefficient for _everybody_. > > The negotiation for future fetches uses the existing refs as the > starting point. And if we don't know that we have the objects because > there are no refs pointing at them, they're going to get transferred > again. That's extra load no the server, and extra time for the user > waiting on the network.
Oh. I thought the protocol git used was something like client: I want to fetch refs A and B server: so you'll need objects 12345678 and 90ABCDEF, A and B both point to 12345678 client: please give me object 12345678 server: here it is [...] I was clearly wrong, thanks! (and thanks Ævar for your explanation in the side-thread, too!) > I tend to agree with the direction of thinking you outlined: you're > generally better off completing the fetch to a local namespace that > tracks the other side completely, and then manipulating the local refs > as you see fit (e.g., fetching into refs/quarantine, and then migrating > "good" refs over to refs/remotes/origin). Hmm... so do I understand it correctly when I say the process you're thinking about works like this? * User installs hook for my-remote by running [something] * User runs git fetch * git fetch fetches remote refs/heads/* to local refs/quarantine/* (so I guess [something] changes the remote.my-remote.fetch refmap) * When this is done `git fetch` runs a notification-only post-fetch hook (that would need to be added) * The post-fetch hook then performs whatever it wants and updates the references in refs/remotes/my-remote/* So the changes that are required are: * Adding a notification-only post-fetch hook * For handling tags, there is a need to have a refmap for tags. Maybe adding a remote.my-remote.fetchTags refmap, that would be used when running with --tags, and having it default to “refs/tags/*:refs/tags/*” to keep the current behavior by default? The only remaining issue I can think of is: How do we avoid the issue of the trigger-only-hook-inciting-bad-behavior-by-hook-authors-who-really-want-modification raised in the side-thread that Junio wrote in ? Maybe just writing in the documentation that the hook should use a quarantine-like approach if it wants modification would be enough to not have hook authors try to modify the ref in the post-fetch hook? Thanks for all your thoughts, and hope we're getting somewhere! Leo PS: I'll read over the reviews once I'm all clear as to what exactly is wanted for this patch, as most likely they'll just be dumped, given the current state of affairs.  https://marc.info/?l=git&m=132480559712592&w=2