Re: [PATCH v3] gc: ignore old gc.log files

Junio C Hamano Fri, 10 Feb 2017 11:48:00 -0800

David Turner <[email protected]> writes:

> It would be good if automatic gc were useful to server operators.
> A server can end up in a state whre there are
> lots of unreferenced loose objects (say, because many users are doing
> a bunch of rebasing and pushing their rebased branches). Before this
> patch, this state would cause a gc.log file to be created, preventing
> future auto gcs.  Then pack files could pile up.  Since many git
> operations are O(n) in the number of pack files, this would lead to
> poor performance.  Now, the pack files will get cleaned up, if
> necessary, at least once per day.  And operators who find a need for
> more-frequent gcs can adjust gc.logExpiry to meet their needs.
>
> Git should never get itself into a state where it refuses to do any
> maintenance, just because at some point some piece of the maintenance
> didn't make progress.
>
> In this commit, git learns to ignore gc.log files which are older than
> (by default) one day old.  It also learns about a config, gc.logExpiry
> to manage this.  There is also some cleanup: a successful manual gc,
> or a warning-free auto gc with an old log file, will remove any old
> gc.log files.
>
> It might still happen that manual intervention is required
> (e.g. because the repo is corrupt), but at the very least it won't be
> because Git is too dumb to try again.


Thanks, nicely explained.

> +     if (fstat(get_lock_file_fd(&log_lock), &st)) {
> +             if (errno == ENOENT) {
> +                     /*
> +                      * The user has probably intentionally deleted
> +                      * gc.log.lock (perhaps because they're blowing
> +                      * away the whole repo), so thre's no need to
> +                      * report anything here.  But we also won't
> +                      * delete gc.log, because we don't know what
> +                      * the user's intentions are.
> +                      */

OK.

> +             } else {
> +                     FILE *fp;
> +                     int fd;
> +                     int saved_errno = errno;
> +                     /*
> +                      * Perhaps there was an i/o error or another
> +                      * unlikely situation.  Try to make a note of
> +                      * this in gc.log.  If this fails again,
> +                      * give up and leave gc.log as it was.
> +                      */
> +                     rollback_lock_file(&log_lock);
> +                     fd = hold_lock_file_for_update(&log_lock,
> +                                                    git_path("gc.log"),
> +                                                    LOCK_DIE_ON_ERROR);
> +
> +                     fp = fdopen(fd, "w");
> +                     fprintf(fp, _("Failed to fstat %s: %s"),
> +                             get_tempfile_path(&log_lock.tempfile),
> +                             strerror(errno));

I think you meant to use saved_errno in this message and then

> +                     fclose(fp);
> +                     commit_lock_file(&log_lock);

possibly assign it back to errno around here?

> +             }
> +
> +     } else if (st.st_size) {
> +             /* There was some error recorded in the lock file */
>               commit_lock_file(&log_lock);
> -     else
> +     } else {
> +             /* No error, clean up any old gc.log */
> +             unlink(git_path("gc.log"));
>               rollback_lock_file(&log_lock);
> +     }
>  }

> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index 1762dfa6a..84ad07eb2 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -67,5 +67,18 @@ test_expect_success 'auto gc with too many loose objects 
> does not attempt to cre
>       test_line_count = 2 new # There is one new pack and its .idx
>  '
>  
> +test_expect_success 'background auto gc does not run if gc.log is present 
> and recent but does if it is old' '
> +     keep=$(ls .git/objects/pack/*.pack|head -1|sed -e "s/pack$/keep/") &&

You could save one process with:

    ls .git/objects/pack/*.pack | sed -e "s/pack$/keep/" -e q

but do you even need $keep?  I do not see it used below.

> +     test_commit foo &&
> +     test_commit bar &&
> +     git repack &&
> +     test_config gc.autopacklimit 1 &&
> +     test_config gc.autodetach true &&
> +     echo fleem >.git/gc.log &&
> +     test_must_fail git gc --auto 2>err &&
> +     test_i18ngrep "^error:" err &&
> +     test-chmtime =-86401 .git/gc.log &&
> +     git gc --auto
> +'
>  
>  test_done

Other than that I didn't spot anything suspicious.  I'll wait to see
what others may want to add.

Thanks.

Re: [PATCH v3] gc: ignore old gc.log files

Reply via email to