Re: Rebase performance

Ævar Arnfjörð Bjarmason Thu, 25 Feb 2016 08:32:24 -0800

On Wed, Feb 24, 2016 at 11:09 PM, Christian Couder
<christian.cou...@gmail.com> wrote:


[Resent because I was accidentally in GMail's HTML mode and the ML rejected it]

> If there was a config option called maybe "rebase.taskset" or
> "rebase.setcpuaffinity" that could be set to ask the OS for all the
> rebase child processes to be run on the same core, people who run many
> rebases on big repos on big servers as we do at Booking.com could
> easily benefit from a nice speed up.
>
> Technically the option may make git-rebase--am.sh call "git am" using
> "taskset" (if taskset is available on the current OS).

I think aside from issues with git-apply this would be an interesting
feature to have in git. I.e. some general facility to intercept
commands and inject a prefix command in front of them, whether that's
taskset, nice/ionice, strace etc.

> Another possibility would be to libify the "git apply" functionality
> and then to use the libified "git apply" in run_apply() instead of
> launching a separate "git apply" process. One benefit from this is
> that we could probably get rid of the read_cache_from() call at the
> end of run_apply() and this would likely further speed up things. Also
> avoiding to launch separate processes might be a win especially on
> Windows.

Yeah that should help in this particular case and make the taskset
redundant since the whole sequence of operations would all be on one
core, right?

At the risk of derailing this thread, a thing that would make rebase
even faster I think would be to change it so that instead of applying
a patch at a time to the working tree the whole operation takes place
on temporary trees & commits and then we'll eventually move the branch
pointer to that once it's finished.

I.e. there's no reason for why a sequence of 1000 patches where a
FOO.txt is changed from "hi1", "hi2", "hi3", ... would be noticeably
slower than applying the same changes with git-fast-import.

Of course this would require a lot of nuances, e.g. if there's a
conflict we'd need to change the working tree & index as we do now
before continuing.

Has anyone looked into some advanced refactoring of the rebase process
that would work like this, or has some feedback on why this would be
dumb or that there's a better way to do it?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Re: Rebase performance

Reply via email to