David Turner <[email protected]> writes:
> It would be good if automatic gc were useful to server operators.
> A server can end up in a state whre there are
> lots of unreferenced loose objects (say, because many users are doing
> a bunch of rebasing and pushing their rebased branches). Before this
> patch, this state would cause a gc.log file to be created, preventing
> future auto gcs. Then pack files could pile up. Since many git
> operations are O(n) in the number of pack files, this would lead to
> poor performance. Now, the pack files will get cleaned up, if
> necessary, at least once per day. And operators who find a need for
> more-frequent gcs can adjust gc.logExpiry to meet their needs.
>
> Git should never get itself into a state where it refuses to do any
> maintenance, just because at some point some piece of the maintenance
> didn't make progress.
>
> In this commit, git learns to ignore gc.log files which are older than
> (by default) one day old. It also learns about a config, gc.logExpiry
> to manage this. There is also some cleanup: a successful manual gc,
> or a warning-free auto gc with an old log file, will remove any old
> gc.log files.
>
> It might still happen that manual intervention is required
> (e.g. because the repo is corrupt), but at the very least it won't be
> because Git is too dumb to try again.
Thanks, nicely explained.
> + if (fstat(get_lock_file_fd(&log_lock), &st)) {
> + if (errno == ENOENT) {
> + /*
> + * The user has probably intentionally deleted
> + * gc.log.lock (perhaps because they're blowing
> + * away the whole repo), so thre's no need to
> + * report anything here. But we also won't
> + * delete gc.log, because we don't know what
> + * the user's intentions are.
> + */
OK.
> + } else {
> + FILE *fp;
> + int fd;
> + int saved_errno = errno;
> + /*
> + * Perhaps there was an i/o error or another
> + * unlikely situation. Try to make a note of
> + * this in gc.log. If this fails again,
> + * give up and leave gc.log as it was.
> + */
> + rollback_lock_file(&log_lock);
> + fd = hold_lock_file_for_update(&log_lock,
> + git_path("gc.log"),
> + LOCK_DIE_ON_ERROR);
> +
> + fp = fdopen(fd, "w");
> + fprintf(fp, _("Failed to fstat %s: %s"),
> + get_tempfile_path(&log_lock.tempfile),
> + strerror(errno));
I think you meant to use saved_errno in this message and then
> + fclose(fp);
> + commit_lock_file(&log_lock);
possibly assign it back to errno around here?
> + }
> +
> + } else if (st.st_size) {
> + /* There was some error recorded in the lock file */
> commit_lock_file(&log_lock);
> - else
> + } else {
> + /* No error, clean up any old gc.log */
> + unlink(git_path("gc.log"));
> rollback_lock_file(&log_lock);
> + }
> }
> diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
> index 1762dfa6a..84ad07eb2 100755
> --- a/t/t6500-gc.sh
> +++ b/t/t6500-gc.sh
> @@ -67,5 +67,18 @@ test_expect_success 'auto gc with too many loose objects
> does not attempt to cre
> test_line_count = 2 new # There is one new pack and its .idx
> '
>
> +test_expect_success 'background auto gc does not run if gc.log is present
> and recent but does if it is old' '
> + keep=$(ls .git/objects/pack/*.pack|head -1|sed -e "s/pack$/keep/") &&
You could save one process with:
ls .git/objects/pack/*.pack | sed -e "s/pack$/keep/" -e q
but do you even need $keep? I do not see it used below.
> + test_commit foo &&
> + test_commit bar &&
> + git repack &&
> + test_config gc.autopacklimit 1 &&
> + test_config gc.autodetach true &&
> + echo fleem >.git/gc.log &&
> + test_must_fail git gc --auto 2>err &&
> + test_i18ngrep "^error:" err &&
> + test-chmtime =-86401 .git/gc.log &&
> + git gc --auto
> +'
>
> test_done
Other than that I didn't spot anything suspicious. I'll wait to see
what others may want to add.
Thanks.