Re: [PATCH 0/7] nd/repack-keep-pack update

2018-04-14 Thread Junio C Hamano
On Sun, Apr 15, 2018 at 4:47 AM, Ævar Arnfjörð Bjarmason wrote:
>
> The only (trivial) issue I found in the patches themselves was that
> between 4/7 and 5/7 you're adding an empty line to config.txt in 4/7
> just to erase it in 5/7, better not to add it to begin with, but
> hopefully Junio can fix that up (if he cares).

I do care, but I wish I did not have to spend time and concentration on
doing so, even though I am fully capable of skipping this round and
remembering to queue a rerolled one.

I've seen mentions like the above one a few times on the list recently, so let
me make it clear. Some things are easier to tweak locally than others, and
I'd rather not waste my time cleaning up other people's mess. A simple
typofix that does not cascade through to later steps in a series is one thing.
A tweak that changes the number of lines in a hunk, and so forces a later
step to compensate, is more involved.

Don't expect your traffic cop to wash your car while you're stopped at a
red signal.


Re: [PATCH 0/7] nd/repack-keep-pack update

2018-04-14 Thread Ævar Arnfjörð Bjarmason

On Sat, Apr 14 2018, Nguyễn Thái Ngọc Duy wrote:

> This is basically a resend from the last round but rebased on
> 'master'. The old base results in some conflicts with packfile and oid
> conversion series. This one should reduce merge conflicts on 'pu' to
> trivial ones.

Thanks. I've been running this at work and as noted in
https://public-inbox.org/git/87vadpxv27@evledraar.gmail.com/ it's
had a big performance impact for the better; users even started noticing
it (they'd previously get noticeable slowdowns while doing other tasks
during GC).

I also tried to see just how much worse this was making performance. My
hunch was that the difference should be small but noticeable, since
we'll produce a less efficient pack.

What I found was the opposite, under real-world conditions it seems to
be making things 1-2% better on common git operations, which I suspect
is because once we've done a few pulls and coalesced those into their
own pack(s) there's more cache locality for the data we're actually
looking at.

I.e. once you've got a repo that has a big pack you're not touching, and a
few weeks of updates from upstream that you've coalesced into another
pack there's a higher density of stuff you care about near HEAD per FS
page in the recent smaller pack, which if you're pressed for memory and
parts of your pack are getting paged out of the FS cache is a win. I
haven't confirmed that, it's just a hypothesis.

The only (trivial) issue I found in the patches themselves was that
between 4/7 and 5/7 you're adding an empty line to config.txt in 4/7
just to erase it in 5/7, better not to add it to begin with, but
hopefully Junio can fix that up (if he cares).


[PATCH 0/7] nd/repack-keep-pack update

2018-04-14 Thread Nguyễn Thái Ngọc Duy
This is basically a resend from the last round but rebased on
'master'. The old base results in some conflicts with packfile and oid
conversion series. This one should reduce merge conflicts on 'pu' to
trivial ones.

Nguyễn Thái Ngọc Duy (7):
  t7700: have closing quote of a test at the beginning of line
  repack: add --keep-pack option
  gc: add --keep-largest-pack option
  gc: add gc.bigPackThreshold config
  gc: handle a corner case in gc.bigPackThreshold
  gc --auto: exclude base pack if not enough mem to "repack -ad"
  pack-objects: show some progress when counting kept objects

 Documentation/config.txt           |  12 +++
 Documentation/git-gc.txt           |  19 +++-
 Documentation/git-pack-objects.txt |   9 +-
 Documentation/git-repack.txt       |   9 +-
 builtin/gc.c                       | 165 +++--
 builtin/pack-objects.c             |  83 +++
 builtin/repack.c                   |  21 +++-
 config.mak.uname                   |   1 +
 git-compat-util.h                  |   4 +
 object-store.h                     |   1 +
 pack-objects.h                     |   2 +
 t/t6500-gc.sh                      |  32 ++
 t/t7700-repack.sh                  |  27 -
 13 files changed, 349 insertions(+), 36 deletions(-)

-- 
2.17.0.367.g5dd2e386c3