Re: [PATCH v2 5/5] fetch: implement fetch.fsck.*

2018-05-31 Thread Ævar Arnfjörð Bjarmason


On Wed, May 30 2018, Junio C Hamano wrote:

> Earlier I mumbled "this 4-patch series generally looks good but I
> need to re-read the implementation step"; I meant this 5-patch
> series and here is my impression after re-reading the implementation
> step.
>
> Ævar Arnfjörð Bjarmason   writes:
>
>> diff --git a/Documentation/config.txt b/Documentation/config.txt
>> index f97f21c022..f69cd31ad0 100644
>> --- a/Documentation/config.txt
>> +++ b/Documentation/config.txt
>> @@ -1426,6 +1426,16 @@ fetch.fsckObjects::
>> ...
>
> Nicely written.
>
>> diff --git a/fetch-pack.c b/fetch-pack.c
>> index 490c38f833..9e4282788e 100644
>> --- a/fetch-pack.c
>> +++ b/fetch-pack.c
>> @@ -19,6 +19,7 @@
>>  #include "sha1-array.h"
>>  #include "oidset.h"
>>  #include "packfile.h"
>> +#include "fsck.h"
>>
>>  static int transfer_unpack_limit = -1;
>>  static int fetch_unpack_limit = -1;
>> @@ -33,6 +34,7 @@ static int agent_supported;
>>  static int server_supports_filtering;
>>  static struct lock_file shallow_lock;
>>  static const char *alternate_shallow_file;
>> +static struct strbuf fsck_msg_types = STRBUF_INIT;
>
> s/msg_types[]/exclude[]/ or something, as this is not just about "we
> tolerate known and future instances of 0-padded mode in trees", but
> also "we know this and that object is bad so do not complain" as well.

I copied the fsck_msg_types variable as-is from builtin/receive-pack.c
which Johannes added in 5d477a334a ("fsck (receive-pack): allow demoting
errors to warnings", 2015-06-22).

That's not a justification for doing the same here, but does your
critique also extend to that variable name, so I could fix it there
while I'm at it?

> Other than that, the implementation looks good.
>
>> diff --git a/t/t5504-fetch-receive-strict.sh 
>> b/t/t5504-fetch-receive-strict.sh
>> index 49d3621a92..b7f5222c2b 100755
>> --- a/t/t5504-fetch-receive-strict.sh
>> +++ b/t/t5504-fetch-receive-strict.sh
>> @@ -135,6 +135,20 @@ test_expect_success 'push with receive.fsck.skipList' '
>>  git push --porcelain dst bogus
>>  '
>>
>> +test_expect_success 'fetch with fetch.fsck.skipList' '
>> +commit="$(git hash-object -t commit -w --stdin > +refspec=refs/heads/bogus:refs/heads/bogus &&
>> +git push . $commit:refs/heads/bogus &&
>> +rm -rf dst &&
>> +git init dst &&
>> +git --git-dir=dst/.git config fetch.fsckObjects true &&
>> +test_must_fail git --git-dir=dst/.git fetch "file://$(pwd)" $refspec &&
>> +git --git-dir=dst/.git config fetch.fsck.skipList dst/.git/SKIP &&
>> +echo $commit >dst/.git/SKIP &&
>> +git --git-dir=dst/.git fetch "file://$(pwd)" $refspec
>> +'
>
> If the test does not change meaning when file://$(pwd) is replaced
> with "." (or ".." or whatever if other tests below moves cwd
> around), I'd think it is preferrable.  Seeing $(pwd) always makes me
> nervous about Windows.

Thanks. Will fix.


Re: [PATCH v2 5/5] fetch: implement fetch.fsck.*

2018-05-29 Thread Junio C Hamano
Earlier I mumbled "this 4-patch series generally looks good but I
need to re-read the implementation step"; I meant this 5-patch
series and here is my impression after re-reading the implementation
step.

Ævar Arnfjörð Bjarmason   writes:

> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index f97f21c022..f69cd31ad0 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -1426,6 +1426,16 @@ fetch.fsckObjects::
> ...

Nicely written.

> diff --git a/fetch-pack.c b/fetch-pack.c
> index 490c38f833..9e4282788e 100644
> --- a/fetch-pack.c
> +++ b/fetch-pack.c
> @@ -19,6 +19,7 @@
>  #include "sha1-array.h"
>  #include "oidset.h"
>  #include "packfile.h"
> +#include "fsck.h"
>  
>  static int transfer_unpack_limit = -1;
>  static int fetch_unpack_limit = -1;
> @@ -33,6 +34,7 @@ static int agent_supported;
>  static int server_supports_filtering;
>  static struct lock_file shallow_lock;
>  static const char *alternate_shallow_file;
> +static struct strbuf fsck_msg_types = STRBUF_INIT;

s/msg_types[]/exclude[]/ or something, as this is not just about "we
tolerate known and future instances of 0-padded mode in trees", but
also "we know this and that object is bad so do not complain" as well.

Other than that, the implementation looks good.

> diff --git a/t/t5504-fetch-receive-strict.sh b/t/t5504-fetch-receive-strict.sh
> index 49d3621a92..b7f5222c2b 100755
> --- a/t/t5504-fetch-receive-strict.sh
> +++ b/t/t5504-fetch-receive-strict.sh
> @@ -135,6 +135,20 @@ test_expect_success 'push with receive.fsck.skipList' '
>   git push --porcelain dst bogus
>  '
>  
> +test_expect_success 'fetch with fetch.fsck.skipList' '
> + commit="$(git hash-object -t commit -w --stdin  + refspec=refs/heads/bogus:refs/heads/bogus &&
> + git push . $commit:refs/heads/bogus &&
> + rm -rf dst &&
> + git init dst &&
> + git --git-dir=dst/.git config fetch.fsckObjects true &&
> + test_must_fail git --git-dir=dst/.git fetch "file://$(pwd)" $refspec &&
> + git --git-dir=dst/.git config fetch.fsck.skipList dst/.git/SKIP &&
> + echo $commit >dst/.git/SKIP &&
> + git --git-dir=dst/.git fetch "file://$(pwd)" $refspec
> +'

If the test does not change meaning when file://$(pwd) is replaced
with "." (or ".." or whatever if other tests below moves cwd
around), I'd think it is preferrable.  Seeing $(pwd) always makes me
nervous about Windows.


[PATCH v2 5/5] fetch: implement fetch.fsck.*

2018-05-25 Thread Ævar Arnfjörð Bjarmason
Implement support for fetch.fsck.* corresponding with the existing
receive.fsck.*. This allows for pedantically cloning repositories with
specific issues without turning off fetch.fsckObjects.

One such repository is https://github.com/robbyrussell/oh-my-zsh.git
which before this change will emit this error when cloned with
fetch.fsckObjects:

error: object 2b7227859263b6aabcc28355b0b994995b7148b6: zeroPaddedFilemode: 
contains zero-padded file modes
fatal: Error in object
fatal: index-pack failed

Now with fetch.fsck.zeroPaddedFilemode=warn we'll warn about that
issue, but the clone will succeed:

warning: object 2b7227859263b6aabcc28355b0b994995b7148b6: 
zeroPaddedFilemode: contains zero-padded file modes
warning: object a18c4d13c2a5fa2d4ecd5346c50e119b999b807d: 
zeroPaddedFilemode: contains zero-padded file modes
warning: object 84df066176c8da3fd59b13731a86d90f4f1e5c9d: 
zeroPaddedFilemode: contains zero-padded file modes

The motivation for this is to be able to turn on fetch.fsckObjects
globally across a fleet of computers but still be able to manually
clone various legacy repositories by either white-listing specific
issues, or better yet whitelist specific objects.

The use of --git-dir=* instead of -C in the tests could be considered
somewhat archaic, but the tests I'm adding here are duplicating the
corresponding receive.* tests with as few changes as possible.

Signed-off-by: Ævar Arnfjörð Bjarmason 
---
 Documentation/config.txt| 21 +++
 fetch-pack.c| 32 +--
 t/t5504-fetch-receive-strict.sh | 46 +
 3 files changed, 92 insertions(+), 7 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index f97f21c022..f69cd31ad0 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1426,6 +1426,16 @@ fetch.fsckObjects::
checked. Defaults to false. If not set, the value of
`transfer.fsckObjects` is used instead.
 
+fetch.fsck.::
+   Acts like `fsck.`, but is used by
+   linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See
+   the `fsck.` documentation for details.
+
+fetch.fsck.skipList::
+   Acts like `fsck.skipList`, but is used by
+   linkgit:git-fetch-pack[1] instead of linkgit:git-fsck[1]. See
+   the `fsck.skipList` documentation for details.
+
 fetch.unpackLimit::
If the number of objects fetched over the Git native
transfer is below this
@@ -1561,9 +1571,10 @@ fsck.::
repositories containing such data.
 +
 Setting `fsck.` will be picked up by linkgit:git-fsck[1], but
-to accept pushes of such data set `receive.fsck.` instead. The
-rest of the documentation discusses `fsck.*` for brevity, but the same
-applies for the corresponding `receive.fsck.*` variables.
+to accept pushes of such data set `receive.fsck.` instead, or
+to clone or fetch it set `fetch.fsck.`. The rest of the
+documentation discusses `fsck.*` for brevity, but the same applies for
+the corresponding `receive.fsck.*` and `fetch..*`. variables.
 +
 When `fsck.` is set, errors can be switched to warnings and
 vice versa by configuring the `fsck.` setting where the
@@ -1588,8 +1599,8 @@ fsck.skipList::
email addresses. Note: corrupt objects cannot be skipped with
this setting.
 +
-Like `fsck.` this variable has a corresponding
-`receive.fsck.skipList` variant.
+Like `fsck.` this variable has corresponding
+`receive.fsck.skipList` and `fetch.fsck.skipList` variants.
 
 gc.aggressiveDepth::
The depth parameter used in the delta compression
diff --git a/fetch-pack.c b/fetch-pack.c
index 490c38f833..9e4282788e 100644
--- a/fetch-pack.c
+++ b/fetch-pack.c
@@ -19,6 +19,7 @@
 #include "sha1-array.h"
 #include "oidset.h"
 #include "packfile.h"
+#include "fsck.h"
 
 static int transfer_unpack_limit = -1;
 static int fetch_unpack_limit = -1;
@@ -33,6 +34,7 @@ static int agent_supported;
 static int server_supports_filtering;
 static struct lock_file shallow_lock;
 static const char *alternate_shallow_file;
+static struct strbuf fsck_msg_types = STRBUF_INIT;
 
 /* Remember to update object flag allocation in object.h */
 #define COMPLETE   (1U << 0)
@@ -935,7 +937,8 @@ static int get_pack(struct fetch_pack_args *args,
 */
argv_array_push(, "--fsck-objects");
else
-   argv_array_push(, "--strict");
+   argv_array_pushf(, "--strict%s",
+fsck_msg_types.buf);
}
 
cmd.in = demux.out;
@@ -1409,6 +1412,31 @@ static struct ref *do_fetch_pack_v2(struct 
fetch_pack_args *args,
return ref;
 }
 
+static int fetch_pack_config_cb(const char *var, const char *value, void *cb)
+{
+   if (strcmp(var, "fetch.fsck.skiplist") == 0) {
+   const char *path;
+
+   if (git_config_pathname(, var, value))
+