Re: [PATCH] gc: do not warn about too many loose objects

Jeff King Mon, 16 Jul 2018 12:42:01 -0700

On Mon, Jul 16, 2018 at 12:09:49PM -0700, Jonathan Nieder wrote:

> >>> So while I completely agree that it's not a great thing to encourage
> >>> users to blindly run "git prune", I think it _is_ actionable.
> >>
> >> To flesh this out a little more: what user action do you suggest?  Could
> >> we carry out that action automatically?
> >
> > Er, the action is to run "git prune", like the warning says. :)
> 
> I don't think we want to recommend that, especially when "git gc --auto"
> does the right thing automatically.


But that's the point. This warning is written literally after running
"git gc --auto" _didn't_ do the right thing. Yes, it would be nicer if
it could do the right thing. But it doesn't yet know how to.

See the thread I linked earlier on putting unreachable objects into a
pack, which I think is the real solution.

> > The warning that is deleted by this patch is: you _just_ ran gc, and hey
> > we even did it automatically for you, but we're still in a funky state
> > afterwards. You might need to clean up this state.
> 
> This sounds awful.  It sounds to me like you're saying "git gc --auto"
> is saying "I just did the wrong thing, and here is how you can clean
> up after me".  That's not how I want a program to behave.

Sure, it would be nice if it did the right thing. Nobody has written
that yet. Until they do, we have to deal with the fallout.

> > If you do that without anything further, then it will break the
> > protection against repeatedly running auto-gc, as I described in the
> > previous email.
> 
> Do you mean ratelimiting for the message, or do you actually mean
> repeatedly running auto-gc itself?
> 
> If we suppress warnings, there would still be a gc.log while "git gc
> --auto" is running, just as though there had been no warnings at all.
> I believe this is close to the intended behavior; it's the same as
> what you'd get without daemon mode, except that you lose the warning.

I mean that if you do not write a persistent log, then "gc --auto" will
do an unproductive gc every time it is invoked. I.e., it will see "oh,
there are too many loose objects", and then waste a bunch of CPU every
time you commit.

> > Any of those would work similarly to the "just detect warnings" I
> > suggested earlier, with respect to keeping the "1 day" expiration
> > intact, so I'd be OK with them. In theory they'd be more robust than
> > scraping the "warning:" prefix. But in practice, I think you have to
> > resort to scraping anyway, if you are interested in treating warnings
> > from sub-processes the same way.
> 
> Can you say more about this for me?  I am not understanding what
> you're saying necessitates scraping the output.  I would strongly
> prefer to avoid scraping the output.

A daemonized git-gc runs a bunch of sub-commands (e.g., "git prune")
with their stderr redirected into the logfile. If you want to have
warnings go somewhere else, then either:

  - you need some way to tell those sub-commands to send the warnings
    elsewhere (i.e., _not_ stderr)

    or

  - you have to post-process the output they send to separate warnings
    from other errors. Right now that means scraping. If you are
    proposing a system of machine-readable output, it would need to work
    not just for git-gc, but for every sub-command it runs.

-Peff

Re: [PATCH] gc: do not warn about too many loose objects

Reply via email to