Sun Chao (my former colleague at Huawei) found a bug of
git-pack-redundant.  If there are too many packs and many of them
overlap each other, running `git pack-redundant --all` will
exhaust all memories and the process will be killed by kernel.

There is a script in commit log of commit 3/6, which can be used to
create a repository with lots of redundant packs. Running `git
pack-redundant --all` in it can reproduce this issue.

## Changes since re-roll v9

Eric Sunshine <sunsh...@sunshineco.com> 于2019年2月2日周六 上午3:43写道:
>
> On Fri, Feb 1, 2019 at 11:22 AM Jiang Xin <worldhello....@gmail.com> wrote:
> > Add test cases for git pack-redundant to validate new algorithm for git
> > pack-redundant.
> >
> > Signed-off-by: Jiang Xin <zhiyou...@alibaba-inc.com>
> > ---
> > diff --git a/t/t5323-pack-redundant.sh b/t/t5323-pack-redundant.sh
> > @@ -0,0 +1,510 @@
> > +# Note: DO NOT run it in a subshell, otherwise the variables will not be 
> > set
>
> Which variables won't be set? It's not clear what this restriction is about.
>
> > +# Usage: create_commits_in <repo> A B C ...
> > +create_commits_in () {
> > +       repo="$1" &&
> > +       parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null) || parent=
>
> Broken &&-chain. Instead, perhaps:
>
>     if ! parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null)
>     then
>         parent=
>     fi &&
>
> or something simpler.

Fixed.

> > +       T=$(git -C "$repo" write-tree) &&
> > +       shift &&
> > +       while test $# -gt 0
> > +       do
> > +               name=$1 &&
> > +               test_tick &&
> > +               if test -z "$parent"
> > +               then
> > +                       oid=$(echo $name | git -C "$repo" commit-tree $T)
> > +               else
> > +                       oid=$(echo $name | git -C "$repo" commit-tree -p 
> > $parent $T)
> > +               fi &&
> > +               eval $name=$oid &&
> > +               parent=$oid &&
> > +               shift ||
> > +               return 1
> > +       done
>
> Broken &&-chain. Use:
>
>     done &&
>

Fixed, thanks.

> > It would have been easier for you to grok if the note instead said
> > "this function sets two global shell variables" or something,
> > perhaps?  Such a variable would certainly not be visible if this
> > function is called inside a subshell to the main process.
>
> Yes, better function comments would facilitate comprehension both for
> the reviewer and those working on the code in the future. For
> instance:
>
>     # Create commit for each argument [...with blah properties...] and
>     # assign [...] to shell variable of same name as argument.
>     # NOTE: Avoid calling this function from a subshell since variable
>     # assignments will disappear when subshell exits.

Polished comments for `create_commits_in` and `create_pack_in` helper
function.

>     create_pack_4 () {
>         ...
>         eval P$P4=P4:$P4
>     }
>    ...
>    test_expect_success 'create pack 4, 5' '
>         create_pack_4 && create_pack_5
>     '
>
> I haven't been able to convince myself that this helps readability --
> especially since the function definition is often far removed from the
> single point of use -- over merely inlining the function body directly
> in the sole test which calls it.

Use a new helper function `create_pack_in` to create packs near test 
functions.


## Range diff since v9:

1:  c8dbf8cef2 ! 1:  4719043603 t5323: test cases for git-pack-redundant
    @@ -43,8 +43,8 @@
     +    ALL | x x x x x x x x x x x x x x x x x x x
     +
     +Another repository `shared.git` has unique objects (X-Z), while other 
objects
    -+(marked with letter s) are shared through alt-odb (of `master.git`). The
    -+relationship between packs and objects is as follows:
    ++(marked with letter s) can be found in the shared alt-odb (of 
`master.git`).
    ++The relationship between packs and objects is as follows:
     +
     +        | T A B C D E F G H I J K L M N O P Q R   X Y Z
     +    ----+----------------------------------------------
    @@ -57,11 +57,19 @@
     +master_repo=master.git
     +shared_repo=shared.git
     +
    -+# Note: DO NOT run it in a subshell, otherwise the variables will not be 
set
    -+# Usage: create_commits_in <repo> A B C ...
    ++# Create commits in <repo> and assign each commit's oid to shell variables
    ++# given in the arguments (A, B, and C). E.g.:
    ++#
    ++#     create_commits_in <repo> A B C
    ++#
    ++# NOTE: Avoid calling this function from a subshell since variable
    ++# assignments will disappear when subshell exits.
     +create_commits_in () {
     +  repo="$1" &&
    -+  parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null) || parent=
    ++  if ! parent=$(git -C "$repo" rev-parse HEAD^{} 2>/dev/null)
    ++  then
    ++          parent=
    ++  fi &&
     +  T=$(git -C "$repo" write-tree) &&
     +  shift &&
     +  while test $# -gt 0
    @@ -78,101 +86,26 @@
     +          parent=$oid &&
     +          shift ||
     +          return 1
    -+  done
    ++  done &&
     +  git -C "$repo" update-ref refs/heads/master $oid
     +}
     +
    -+# Note: DO NOT run it in a subshell, otherwise the variables will not be 
set
    -+create_pack_1 () {
    -+  P1=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $T
    -+          $A
    -+          $B
    -+          $C
    -+          $D
    -+          $E
    -+          $F
    -+          $R
    -+          EOF
    -+  ) &&
    -+  eval P$P1=P1:$P1
    -+}
    -+
    -+create_pack_2 () {
    -+  P2=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $B
    -+          $C
    -+          $D
    -+          $E
    -+          $G
    -+          $H
    -+          $I
    -+          EOF
    -+  ) &&
    -+  eval P$P2=P2:$P2
    -+}
    -+
    -+create_pack_3 () {
    -+  P3=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $F
    -+          $I
    -+          $J
    -+          $K
    -+          $L
    -+          $M
    -+          EOF
    -+  ) &&
    -+  eval P$P3=P3:$P3
    -+}
    -+
    -+create_pack_4 () {
    -+  P4=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $J
    -+          $K
    -+          $L
    -+          $M
    -+          $P
    -+          EOF
    -+  ) &&
    -+  eval P$P4=P4:$P4
    -+}
    -+
    -+create_pack_5 () {
    -+  P5=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $G
    -+          $H
    -+          $N
    -+          $O
    -+          EOF
    -+  ) &&
    -+  eval P$P5=P5:$P5
    -+}
    -+
    -+create_pack_6 () {
    -+  P6=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $N
    -+          $O
    -+          $Q
    -+          EOF
    -+  ) &&
    -+  eval P$P6=P6:$P6
    -+}
    -+
    -+create_pack_7 () {
    -+  P7=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $P
    -+          $Q
    -+          EOF
    -+  ) &&
    -+  eval P$P7=P7:$P7
    -+}
    -+
    -+create_pack_8 () {
    -+  P8=$(git -C "$master_repo/objects/pack" pack-objects -q pack <<-EOF
    -+          $A
    -+          EOF
    -+  ) &&
    -+  eval P$P8=P8:$P8
    ++# Create pack in <repo> and assign pack id to variable given in the 2nd 
argument
    ++# (<name>). Commits in the pack will be read from stdin. E.g.:
    ++#
    ++#     create_pack_in <repo> <name> <<-EOF
    ++#         ...
    ++#         EOF
    ++#
    ++# NOTE: commits from stdin should be given using heredoc, not using pipe, 
and
    ++# avoid calling this function from a subshell since variable assignments 
will
    ++# disappear when subshell exits.
    ++create_pack_in () {
    ++  repo="$1" &&
    ++  name="$2" &&
    ++  pack=$(git -C "$repo/objects/pack" pack-objects -q pack) &&
    ++  eval $name=$pack &&
    ++  eval P$pack=$name:$pack
     +}
     +
     +format_packfiles () {
    @@ -209,8 +142,34 @@
     +#     ALL | x x x x x x x x x x x x x x         x
     +#
     
+#############################################################################
    -+test_expect_success 'no redundant for pack 1, 2, 3' '
    -+  create_pack_1 && create_pack_2 && create_pack_3 &&
    ++test_expect_success 'master: no redundant for pack 1, 2, 3' '
    ++  create_pack_in "$master_repo" P1 <<-EOF &&
    ++          $T
    ++          $A
    ++          $B
    ++          $C
    ++          $D
    ++          $E
    ++          $F
    ++          $R
    ++          EOF
    ++  create_pack_in "$master_repo" P2 <<-EOF &&
    ++          $B
    ++          $C
    ++          $D
    ++          $E
    ++          $G
    ++          $H
    ++          $I
    ++          EOF
    ++  create_pack_in "$master_repo" P3 <<-EOF &&
    ++          $F
    ++          $I
    ++          $J
    ++          $K
    ++          $L
    ++          $M
    ++          EOF
     +  (
     +          cd "$master_repo" &&
     +          git pack-redundant --all >out &&
    @@ -218,10 +177,6 @@
     +  )
     +'
     +
    -+test_expect_success 'create pack 4, 5' '
    -+  create_pack_4 && create_pack_5
    -+'
    -+
     
+#############################################################################
     +# Chart of packs and objects for this test case
     +#
    @@ -236,7 +191,20 @@
     +#     ALL | x x x x x x x x x x x x x x x x x   x
     +#
     
+#############################################################################
    -+test_expect_success 'one of pack-2/pack-3 is redundant' '
    ++test_expect_success 'master: one of pack-2/pack-3 is redundant' '
    ++  create_pack_in "$master_repo" P4 <<-EOF &&
    ++          $J
    ++          $K
    ++          $L
    ++          $M
    ++          $P
    ++          EOF
    ++  create_pack_in "$master_repo" P5 <<-EOF &&
    ++          $G
    ++          $H
    ++          $N
    ++          $O
    ++          EOF
     +  (
     +          cd "$master_repo" &&
     +          cat >expect <<-EOF &&
    @@ -248,10 +216,6 @@
     +  )
     +'
     +
    -+test_expect_success 'create pack 6, 7' '
    -+  create_pack_6 && create_pack_7
    -+'
    -+
     
+#############################################################################
     +# Chart of packs and objects for this test case
     +#
    @@ -268,7 +232,16 @@
     +#     ALL | x x x x x x x x x x x x x x x x x x x
     +#
     
+#############################################################################
    -+test_expect_success 'pack 2, 4, and 6 are redundant' '
    ++test_expect_success 'master: pack 2, 4, and 6 are redundant' '
    ++  create_pack_in "$master_repo" P6 <<-EOF &&
    ++          $N
    ++          $O
    ++          $Q
    ++          EOF
    ++  create_pack_in "$master_repo" P7 <<-EOF &&
    ++          $P
    ++          $Q
    ++          EOF
     +  (
     +          cd "$master_repo" &&
     +          cat >expect <<-EOF &&
    @@ -282,10 +255,6 @@
     +  )
     +'
     +
    -+test_expect_success 'create pack 8' '
    -+  create_pack_8
    -+'
    -+
     
+#############################################################################
     +# Chart of packs and objects for this test case
     +#
    @@ -303,7 +272,10 @@
     +#     ALL | x x x x x x x x x x x x x x x x x x x
     +#
     
+#############################################################################
    -+test_expect_success 'pack-8 (subset of pack-1) is also redundant' '
    ++test_expect_success 'master: pack-8 (subset of pack-1) is also redundant' 
'
    ++  create_pack_in "$master_repo" P8 <<-EOF &&
    ++          $A
    ++          EOF
     +  (
     +          cd "$master_repo" &&
     +          cat >expect <<-EOF &&
    @@ -318,7 +290,7 @@
     +  )
     +'
     +
    -+test_expect_success 'clean loose objects' '
    ++test_expect_success 'master: clean loose objects' '
     +  (
     +          cd "$master_repo" &&
     +          git prune-packed &&
    @@ -327,7 +299,7 @@
     +  )
     +'
     +
    -+test_expect_success 'remove redundant packs and pass fsck' '
    ++test_expect_success 'master: remove redundant packs and pass fsck' '
     +  (
     +          cd "$master_repo" &&
     +          git pack-redundant --all | xargs rm &&
    @@ -347,7 +319,7 @@
     +  )
     +'
     +
    -+test_expect_success 'no redundant packs without --alt-odb' '
    ++test_expect_success 'shared: all packs are redundant, but no output 
without --alt-odb' '
     +  (
     +          cd "$shared_repo" &&
     +          git pack-redundant --all >out &&
    @@ -380,7 +352,7 @@
     +#     ALL | x x x x x x x x x x x x x x x x x x x
     +#
     
+#############################################################################
    -+test_expect_success 'pack-redundant --verbose: show duplicate packs in 
stderr' '
    ++test_expect_success 'shared: show redundant packs in stderr for verbose 
mode' '
     +  (
     +          cd "$shared_repo" &&
     +          cat >expect <<-EOF &&
    @@ -396,7 +368,7 @@
     +  )
     +'
     +
    -+test_expect_success 'remove redundant packs by alt-odb, no packs left' '
    ++test_expect_success 'shared: remove redundant packs, no packs left' '
     +  (
     +          cd "$shared_repo" &&
     +          cat >expect <<-EOF &&
    @@ -409,10 +381,9 @@
     +  )
     +'
     +
    -+# Note: DO NOT run function `create_pack_*` in sub shell, or variables 
are not set
    -+create_pack_x1_in () {
    -+  repo="$1" &&
    -+  Px1=$(git -C "$repo/objects/pack" pack-objects -q pack <<-EOF
    ++test_expect_success 'shared: create new objects and packs' '
    ++  create_commits_in "$shared_repo" X Y Z &&
    ++  create_pack_in "$shared_repo" Px1 <<-EOF &&
     +          $X
     +          $Y
     +          $Z
    @@ -420,13 +391,7 @@
     +          $B
     +          $C
     +          EOF
    -+  ) &&
    -+  eval P${Px1}=Px1:${Px1}
    -+}
    -+
    -+create_pack_x2_in () {
    -+  repo="$1" &&
    -+  Px2=$(git -C "$repo/objects/pack" pack-objects -q pack <<-EOF
    ++  create_pack_in "$shared_repo" Px2 <<-EOF
     +          $X
     +          $Y
     +          $Z
    @@ -434,17 +399,9 @@
     +          $E
     +          $F
     +          EOF
    -+  ) &&
    -+  eval P${Px2}=Px2:${Px2}
    -+}
    -+
    -+test_expect_success 'create new objects and packs in shared.git' '
    -+  create_commits_in "$shared_repo" X Y Z &&
    -+  create_pack_x1_in "$shared_repo" &&
    -+  create_pack_x2_in "$shared_repo"
     +'
     +
    -+test_expect_success 'no redundant without --alt-odb' '
    ++test_expect_success 'shared: no redundant without --alt-odb' '
     +  (
     +          cd "$shared_repo" &&
     +          git pack-redundant --all >out &&
    @@ -475,7 +432,7 @@
     +#     ALL | s s s s s s s s s s s s s s s s s s s   x x x
     +#
     
+#############################################################################
    -+test_expect_success 'one pack is redundant' '
    ++test_expect_success 'shared: one pack is redundant with --alt-odb' '
     +  (
     +          cd "$shared_repo" &&
     +          git pack-redundant --all --alt-odb >out &&
    @@ -508,7 +465,7 @@
     +#                                                  (ignored objects, 
marked with i)
     +#
     
+#############################################################################
    -+test_expect_success 'set ignore objects and all two packs are redundant' '
    ++test_expect_success 'shared: ignore unique objects and all two packs are 
redundant' '
     +  (
     +          cd "$shared_repo" &&
     +          cat >expect <<-EOF &&
2:  a6300516d7 = 2:  4feb1eaa40 pack-redundant: delay creation of unique_objects
3:  fb71973df5 = 3:  875367d7b4 pack-redundant: delete redundant code
4:  9963d1c49f ! 4:  50cb2854f1 pack-redundant: new algorithm to find min packs
    @@ -331,35 +331,35 @@
      #     ALL | x x x x x x x x x x x x x x x x x   x
      #
      
#############################################################################
    --test_expect_success 'one of pack-2/pack-3 is redundant' '
    -+test_expect_failure 'one of pack-2/pack-3 is redundant (failed on Mac)' '
    -   (
    -           cd "$master_repo" &&
    -           cat >expect <<-EOF &&
    +-test_expect_success 'master: one of pack-2/pack-3 is redundant' '
    ++test_expect_failure 'master: one of pack-2/pack-3 is redundant (failed on 
Mac)' '
    +   create_pack_in "$master_repo" P4 <<-EOF &&
    +           $J
    +           $K
     @@
      #     ALL | x x x x x x x x x x x x x x x x x x x
      #
      
#############################################################################
    --test_expect_success 'pack 2, 4, and 6 are redundant' '
    -+test_expect_failure 'pack 2, 4, and 6 are redundant (failed on Mac)' '
    -   (
    -           cd "$master_repo" &&
    -           cat >expect <<-EOF &&
    +-test_expect_success 'master: pack 2, 4, and 6 are redundant' '
    ++test_expect_failure 'master: pack 2, 4, and 6 are redundant (failed on 
Mac)' '
    +   create_pack_in "$master_repo" P6 <<-EOF &&
    +           $N
    +           $O
     @@
      #     ALL | x x x x x x x x x x x x x x x x x x x
      #
      
#############################################################################
    --test_expect_success 'pack-8 (subset of pack-1) is also redundant' '
    -+test_expect_failure 'pack-8 (subset of pack-1) is also redundant (failed 
on Mac)' '
    -   (
    -           cd "$master_repo" &&
    -           cat >expect <<-EOF &&
    +-test_expect_success 'master: pack-8 (subset of pack-1) is also redundant' 
'
    ++test_expect_failure 'master: pack-8 (subset of pack-1) is also redundant 
(failed on Mac)' '
    +   create_pack_in "$master_repo" P8 <<-EOF &&
    +           $A
    +           EOF
     @@
        )
      '
      
    --test_expect_success 'remove redundant packs and pass fsck' '
    -+test_expect_failure 'remove redundant packs and pass fsck (failed on 
Mac)' '
    +-test_expect_success 'master: remove redundant packs and pass fsck' '
    ++test_expect_failure 'master: remove redundant packs and pass fsck (failed 
on Mac)' '
        (
                cd "$master_repo" &&
                git pack-redundant --all | xargs rm &&
    @@ -367,8 +367,8 @@
        )
      '
      
    --test_expect_success 'no redundant packs without --alt-odb' '
    -+test_expect_failure 'no redundant packs without --alt-odb (failed on 
Mac)' '
    +-test_expect_success 'shared: all packs are redundant, but no output 
without --alt-odb' '
    ++test_expect_failure 'shared: all packs are redundant, but no output 
without --alt-odb (failed on Mac)' '
        (
                cd "$shared_repo" &&
                git pack-redundant --all >out &&
    @@ -376,8 +376,8 @@
      #     ALL | x x x x x x x x x x x x x x x x x x x
      #
      
#############################################################################
    --test_expect_success 'pack-redundant --verbose: show duplicate packs in 
stderr' '
    -+test_expect_failure 'pack-redundant --verbose: show duplicate packs in 
stderr (failed on Mac)' '
    +-test_expect_success 'shared: show redundant packs in stderr for verbose 
mode' '
    ++test_expect_failure 'shared: show redundant packs in stderr for verbose 
mode (failed on Mac)' '
        (
                cd "$shared_repo" &&
                cat >expect <<-EOF &&
5:  b8f80ad454 = 5:  4af03876d4 pack-redundant: rename pack_list.all_objects
6:  8a12ad699e ! 6:  89ed4fb2a5 pack-redundant: consistent sort method
    @@ -96,8 +96,12 @@
      #     ALL | x x x x x x x x x x x x x x x x x   x
      #
      
#############################################################################
    --test_expect_failure 'one of pack-2/pack-3 is redundant (failed on Mac)' '
    -+test_expect_success 'one of pack-2/pack-3 is redundant' '
    +-test_expect_failure 'master: one of pack-2/pack-3 is redundant (failed on 
Mac)' '
    ++test_expect_success 'master: one of pack-2/pack-3 is redundant' '
    +   create_pack_in "$master_repo" P4 <<-EOF &&
    +           $J
    +           $K
    +@@
        (
                cd "$master_repo" &&
                cat >expect <<-EOF &&
    @@ -110,26 +114,26 @@
      #     ALL | x x x x x x x x x x x x x x x x x x x
      #
      
#############################################################################
    --test_expect_failure 'pack 2, 4, and 6 are redundant (failed on Mac)' '
    -+test_expect_success 'pack 2, 4, and 6 are redundant' '
    -   (
    -           cd "$master_repo" &&
    -           cat >expect <<-EOF &&
    +-test_expect_failure 'master: pack 2, 4, and 6 are redundant (failed on 
Mac)' '
    ++test_expect_success 'master: pack 2, 4, and 6 are redundant' '
    +   create_pack_in "$master_repo" P6 <<-EOF &&
    +           $N
    +           $O
     @@
      #     ALL | x x x x x x x x x x x x x x x x x x x
      #
      
#############################################################################
    --test_expect_failure 'pack-8 (subset of pack-1) is also redundant (failed 
on Mac)' '
    -+test_expect_success 'pack-8 (subset of pack-1) is also redundant' '
    -   (
    -           cd "$master_repo" &&
    -           cat >expect <<-EOF &&
    +-test_expect_failure 'master: pack-8 (subset of pack-1) is also redundant 
(failed on Mac)' '
    ++test_expect_success 'master: pack-8 (subset of pack-1) is also redundant' 
'
    +   create_pack_in "$master_repo" P8 <<-EOF &&
    +           $A
    +           EOF
     @@
        )
      '
      
    --test_expect_failure 'remove redundant packs and pass fsck (failed on 
Mac)' '
    -+test_expect_success 'remove redundant packs and pass fsck' '
    +-test_expect_failure 'master: remove redundant packs and pass fsck (failed 
on Mac)' '
    ++test_expect_success 'master: remove redundant packs and pass fsck' '
        (
                cd "$master_repo" &&
                git pack-redundant --all | xargs rm &&
    @@ -137,8 +141,8 @@
        )
      '
      
    --test_expect_failure 'no redundant packs without --alt-odb (failed on 
Mac)' '
    -+test_expect_success 'no redundant packs without --alt-odb' '
    +-test_expect_failure 'shared: all packs are redundant, but no output 
without --alt-odb (failed on Mac)' '
    ++test_expect_success 'shared: all packs are redundant, but no output 
without --alt-odb' '
        (
                cd "$shared_repo" &&
                git pack-redundant --all >out &&
    @@ -146,8 +150,8 @@
      #     ALL | x x x x x x x x x x x x x x x x x x x
      #
      
#############################################################################
    --test_expect_failure 'pack-redundant --verbose: show duplicate packs in 
stderr (failed on Mac)' '
    -+test_expect_success 'pack-redundant --verbose: show duplicate packs in 
stderr' '
    +-test_expect_failure 'shared: show redundant packs in stderr for verbose 
mode (failed on Mac)' '
    ++test_expect_success 'shared: show redundant packs in stderr for verbose 
mode' '
        (
                cd "$shared_repo" &&
                cat >expect <<-EOF &&

--

Jiang Xin (4):
  t5323: test cases for git-pack-redundant
  pack-redundant: delay creation of unique_objects
  pack-redundant: rename pack_list.all_objects
  pack-redundant: consistent sort method

Sun Chao (2):
  pack-redundant: delete redundant code
  pack-redundant: new algorithm to find min packs

 builtin/pack-redundant.c  | 232 ++++++++-----------
 t/t5323-pack-redundant.sh | 467 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 559 insertions(+), 140 deletions(-)
 create mode 100755 t/t5323-pack-redundant.sh

-- 
2.20.1.103.ged0fc2ca7b

Reply via email to