Re: [PATCH 16/18] fsck: support demoting errors to warnings

2015-01-31 Thread Johannes Schindelin
Hi Michael  Junio,

On 2015-01-22 18:17, Johannes Schindelin wrote:

 [...] we need to avoid confusing settings such as
 
 ```
 [receive.fsck]
 warn = missing-tagger-entry
 error = missing-tagger-entry
 ```

I *think* I found a solution.

Please let me recapitulate quickly the problem Michael brought up: if we 
support `receive.fsck.warn` to override `receive.fsck.error` and vice versa, 
with comma-separated lists, then it can be quite confusing to the user, and 
actually quite difficult to figure out on the command-line which setting is in 
effect (because it really depends on the *order* of the receive.fsck.* lines, 
*plus* the fact that the values are comma-separated lists).

On the other hand, Junio pointed out two shortcomings with my original 
implementation (i.e. to support `receive.fsck.id = (error|warn|ignore)`), 
however: it is tedious to set multiple severity levels, and it violates the 
config file convention that the config variable names are CamelCased (the 
message IDs are dashed-lowercase instead).

The solution I just implemented (and will send out shortly in v4 of the patch 
series) is the following: the config variable is called receive.fsck.severity 
and it accepts comma-separated settings. Example:

```
[receive fsck]
severity = multiple-authors=ignore,missing-tagger=error
```

Now, it is *still* not the easiest to figure out the setting from the 
command-line:

```sh
$ git config --get-all receive.fsck.severity |
tr , \n |
grep ^multiple-authors= |
tail -n 1
```

But I hope this is good enough, Michael?

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2015-01-22 Thread Michael Haggerty
On 12/23/2014 06:14 PM, Junio C Hamano wrote:
 Johannes Schindelin johannes.schinde...@gmx.de writes:
 On Tue, 23 Dec 2014, Junio C Hamano wrote:
 I suspect that it would be much better if the configuration variables
 were organized the other way around, e.g.

 $ git config fsck.warn missingTagger,someOtherKindOfError

 I had something similar in an earlier version of my patch series, but it
 was shot down rightfully: if you want to allow inheriting defaults from
 $HOME/.gitconfig, you have to configure the severity levels individually.
 
 Hmmm.  What's wrong with fsck.warn -missingTagger that overrides
 the earlier one, or even fsck.info missingTagger after having
 fsck.warn other,missingTagger,yetanother, with the usual last one
 wins rule?
 
 Whoever shot it down rightfully is wrong here, I would think.

Sorry I didn't notice this earlier; Johannes, please CC me on these
series, especially the ones that I commented on earlier.

I might have been the one who shot down the severity=name style
of configuration [1].

I don't feel strongly enough to make a big deal about this, especially
considering that the other alternative has already been implemented. But
for the record, let me explain why I prefer the name=severity
style of configuration.

First, it is a truer representation of the data structure within the
software, which is basically one severity value for each error type.
This is not a decisive argument, but it often means that there is less
impedance mismatch between the style of configuration and the concepts
that it is configuring. For example,

$ git config receive.fsck.warn A,B,C
$ git config receive.fsck.error C,D,E

seems to be configuring two sets, but it is not. It is mysteriously
setting C to be an error, in seeming contradiction of the first line [2].

Second, it is not correct to say that this is just an application of the
last setting wins rule. The last setting wins rule has heretofore,
as far as I know, only covered *single* settings that take a single
value. If we applied that rule to the following:

$ git config receive.fsck.warn A,B,C
$ git config receive.fsck.warn B,F

then the net result would be B,F. But that is not your proposal at
all; your proposal is for these two settings to be interpreted the same as

$ git config receive.fsck.warn A,B,C,F

Similarly, the traditional last setting rule, applied to the first
example above, wouldn't cause the value of fsck.warn to be reduced to
A,B, as you propose. This is not the last setting rule that we are
familiar with--it operates *across and within* values and across
*multiple* names rather than just across the values for a single name.

Third, the severity=name style is hard to inquire via the command
line, and probably also incompatible with the simplified internal config
API in git (and probably libgit2, JGit, etc). The problem is that
determining a *single* setting requires *three* configuration variables
be inquired, and that the settings for those three variables need to be
processed in the correct order, including the correct order of
interleavings. For example, how would you inquire about the configured
severity level of missingTaggerEntry using the shell? It would be a
mess that would necessarily have to involve git config --get-regexp
and error-prone parsing of comma-separated values. It would be so much
easier to type

$ git config receive.fsck.missingtaggerentry

Fourth, the severity=name style would cause config files to get
cluttered up with unused values. Suppose you have earlier run

$ git config receive.fsck.warn A,B,C
$ git config receive.fsck.ignore D,E

and now you want to demote B to ignore. You can do

$ git config --add receive.fsck.ignore B

(don't forget --add or you've silently erased other, unrelated
settings!) This gives the behavior that you want. But now your config
file looks like

[receive fsck]
warn = A,B,C
ignore = D,E
ignore = B

The B on the first line is now just being carried along for no reason,
but it would be quite awkward to clean it up programmatically.
Effectively, these settings can only be added to but never removed
because of the way multiple properties are mashed into a single setting.


I believe that one of the main arguments for the severity=name
style of configuration is that it carries over more easily into
convenient command-line options. But I think it will be unusual to want
to configure these options by hand on the command line, let alone adjust
many settings at the same time. The idea isn't to make it easy to work
with repositories that have a level of breakage that fluctuates over
time. It is to make it possible to work with *specific* repositories
that have known breakage in their history. For such a repo you would
configure one or two ignore options one time and then never adjust
them again. (And it will also allow us to make our checks stricter in
the future without breaking existing 

Re: [PATCH 16/18] fsck: support demoting errors to warnings

2015-01-22 Thread Johannes Schindelin
Hi Michael,

On 2015-01-22 16:49, Michael Haggerty wrote:
 On 12/23/2014 06:14 PM, Junio C Hamano wrote:
 Johannes Schindelin johannes.schinde...@gmx.de writes:
 On Tue, 23 Dec 2014, Junio C Hamano wrote:
 I suspect that it would be much better if the configuration variables
 were organized the other way around, e.g.

$ git config fsck.warn missingTagger,someOtherKindOfError

 I had something similar in an earlier version of my patch series, but it
 was shot down rightfully: if you want to allow inheriting defaults from
 $HOME/.gitconfig, you have to configure the severity levels individually.

 Hmmm.  What's wrong with fsck.warn -missingTagger that overrides
 the earlier one, or even fsck.info missingTagger after having
 fsck.warn other,missingTagger,yetanother, with the usual last one
 wins rule?

 Whoever shot it down rightfully is wrong here, I would think.
 
 Sorry I didn't notice this earlier; Johannes, please CC me on these
 series, especially the ones that I commented on earlier.

Very sorry, this is my fault. It can only be explained by my switching around 
some tools for other tools to work with email-based patch submission (which I 
had not done in a long time). But still, my mistake.


 [1] I prefer to think that I just offered a little gentle discussion
 that informed Johannes's independent decision :-)

You did convince me back then. I just did not want to put up a fight against 
Junio because I was more interested in getting this feature merged before the 
holidays (it does feel awkward for me to leave work unwrapped-up before leaving 
for an extended amount of time, but I guess I am getting more used to that).

So now I cannot avoid discussing this issue properly...

In essence, I agreed with Junio from the point of view of an elegant 
implementation. But then, Michael is correct that it does not really matter as 
much how complicated the code is, but that it is much more important that the 
feature is elegant to use.

Now let's step back a bit and think about the users which is supposed to be 
supported by this patch series: Git repository hosters -- such as GitHub -- 
need to ensure a certain cleanliness of the repositories they host (for a range 
of reasons, including the prevention of malicious attacks, or helping users 
publish their code in a correct form).

And the scenario in which the feature needs to be used is most likely started 
by some Git user pushing some commits, and `git receive-pack` triggering an 
error. Then the user files a trouble ticket and GitHubber needs to inspect the 
error and the respective object. Now, in the vast number of cases I imagine 
that the objects *are* faulty. However, on occasion the problem should not 
prevent the push, e.g. when somebody crafted a commit object with two authors, 
forgetting that the tools usually cannot handle such commits. Then the 
GitHubber has to decide on a case by case basis whether to demote that error to 
a warning and allow the object to be pushed *into that specific repository*.

I do see the need for this feature to be simple and robust, from the users' 
point of view. In other words, I agree with Michael that we need to avoid 
confusing settings such as

```
[receive.fsck]
warn = missing-tagger-entry
error = missing-tagger-entry
```

This feature will be used rarely enough that the poor soul stuck with 
interpreting the above config section won't remember that a very specific 
version of last setting wins is in effect.

If I remember correctly, Peff suggested that there needs to be a way to handle 
these settings in the /etc/gitconfig $HOME/.gitconfig $XDG../gitconfig 
.git/config cascade, but now I am puzzled whether it is even desirable to 
demote fsck errors globally, i.e. whether we really need to pay attention to 
that config cascade.

And finally, in the course of preparing this patch series, we came up with an 
alternative solution to the problem: the receive.fsck.skiplist (i.e. a file 
that contains a sorted list of SHA-1s of objects that should be skipped from 
fsck'ing). I am more and more convinced that this is the most convenient tool 
for the scenario described above: manual inspection of individual objects will 
tell whether it is safe to allow them onto the server or not.

However, others might disagree and prefer the explicit approach, e.g. when some 
source generates a consistent stream of objects triggering fsck errors.

Summary: I have no preference how to specify the severity levels of fsck 
messages, but I will gladly change my code to whatever you (meaning Junio and 
Michael in particular) want to see implemented.

Thanks for helping me with this feature,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Johannes Schindelin
Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  On Mon, 22 Dec 2014, Junio C Hamano wrote:
 
  Johannes Schindelin johannes.schinde...@gmx.de writes:
  
   Of course you can say that! ;-) The problem these ugly messages try to
   solve is to give the user a hint which setting to change if they want to
   override the default behavior, though...
  
  Ahh, OK, then dashed form would not work as a configuration variable
  names, so missingTaggerEntry would be the only usable option.
 
  Except that cunning me has made it so that both missing-tagger-entry *and*
  missingTaggerEntry work...
 
 Then the missing-tagger-entry side needs to be dropped.  The naming
 does not conform to the way how we name our officially supported
 configuration variables.

But it does conform with the way we do our command-line parameters, and it
would actually cause *more* work (and more complicated code) to have two
separate parsers, or even to force the parser to accept only one way to
specify settings...

Should I really introduce more complexity just to disallow non-camelCased
config variables?

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

   Of course you can say that! ;-) The problem these ugly messages try to
   solve is to give the user a hint which setting to change if they want to
   override the default behavior, though...
  
  Ahh, OK, then dashed form would not work as a configuration variable
  names, so missingTaggerEntry would be the only usable option.
 
  Except that cunning me has made it so that both missing-tagger-entry *and*
  missingTaggerEntry work...
 
 Then the missing-tagger-entry side needs to be dropped.  The naming
 does not conform to the way how we name our officially supported
 configuration variables.

 But it does conform with the way we do our command-line parameters,

Hmmm  What is the expected user interaction?  The way I read the
series was ($MISSING_TAGGER stands for the token we choose to show):

(1) The user runs fsck without customization, and may see a
warning (or error):

$ git fsck
error in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER

(2) The user demotes it to warning and runs fsck again:

$ git config fsck.$MISSING_TAGGER warn
$ git fsck
warning: tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER

I suspect that it would be much better if the configuration
variables were organized the other way around, e.g.

$ git config fsck.warn missingTagger,someOtherKindOfError

Then a one-shot override would make sense and easier to give as
command line option, e.g.

$ git fsck --warn=missingTagger,someOtherKindOfError

But the proposed organization to use one variable per questionable
event type (as opposed to one variable per severity level) would
lead to a one-shot override of this form, e.g.

$ git fsck --missing-tagger=warn --some-other-kind-of-error=warn

which I think is insane to require us to support unbound number of
dashed options.

Or are you saying that we allow git config core.file-mode true
from the command line to set core.fileMode configuration?

Puzzled...
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Johannes Schindelin
Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
Of course you can say that! ;-) The problem these ugly messages
try to solve is to give the user a hint which setting to change
if they want to override the default behavior, though...
   
   Ahh, OK, then dashed form would not work as a configuration
   variable names, so missingTaggerEntry would be the only usable
   option.
  
   Except that cunning me has made it so that both
   missing-tagger-entry *and* missingTaggerEntry work...
  
  Then the missing-tagger-entry side needs to be dropped.  The naming
  does not conform to the way how we name our officially supported
  configuration variables.
 
  But it does conform with the way we do our command-line parameters,
 
 Hmmm  What is the expected user interaction?  The way I read the
 series was ($MISSING_TAGGER stands for the token we choose to show):
 
 (1) The user runs fsck without customization, and may see a
   warning (or error):
 
 $ git fsck
 error in tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER
 
 (2) The user demotes it to warning and runs fsck again:
 
   $ git config fsck.$MISSING_TAGGER warn
 $ git fsck
 warning: tag d6602ec5194c87b0fc87103ca4d67251c76f233a: $MISSING_TAGGER

The intended use case is actually when receive.fsckObjects = true and you
call `git push`, seeing 'remote: error: $MULTIPLE_AUTHORS: ...'.

Now, the $MULTIPLE_AUTHORS *config* setting is parsed by `git
receive-pack`, but that is not the command that needs to customize the
fsck call: it is either `git index-pack` or `git unpack-objects`. So what
`git receive-pack` does is to pass the config options as command-line
options to the called command. For consistency with the rest of Git, the
command-line options were *not* camel-cased, but lower-case,
dash-separated.

The parser I wrote actually accepts both versions, allowing me to skip the
tedious step to convert the camelCased config setting into a
lower-case-dashed version to pass to `index-pack` or `unpack-objects`,
only to be parsed by the same parser as `fsck` would use directly.

So I am rather happy with the fact that the parser handles both camelCased
and lower-case-dashed versions.

 I suspect that it would be much better if the configuration variables
 were organized the other way around, e.g.
 
   $ git config fsck.warn missingTagger,someOtherKindOfError

I had something similar in an earlier version of my patch series, but it
was shot down rightfully: if you want to allow inheriting defaults from
$HOME/.gitconfig, you have to configure the severity levels individually.

(The current solution also sidesteps the problematic situation when both
fsck.warn *and* fsck.error contain, say, missingTagger.)

 Then a one-shot override would make sense and easier to give as
 command line option, e.g.
 
   $ git fsck --warn=missingTagger,someOtherKindOfError

Yep, my first implementation actually used
`--strict=missing-tagger,-some-demoted-error`. But as I mentioned above,
that approach is not as flexible as the current one.

 But the proposed organization to use one variable per questionable
 event type (as opposed to one variable per severity level) would
 lead to a one-shot override of this form, e.g.
 
   $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn
 
 which I think is insane to require us to support unbound number of
 dashed options.

The intended use case is actually *not* the command-line, but the config
file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and*
.git/config to customize the settings.

 Or are you saying that we allow git config core.file-mode true
 from the command line to set core.fileMode configuration?

I do not understand this reference. I did not suggest to change `git
config`, did I? If I did, I apologize; it was definitely *not* my
intention to change long-standing customs.

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 I suspect that it would be much better if the configuration
 variables were organized the other way around, e.g.

   $ git config fsck.warn missingTagger,someOtherKindOfError

By the way, I think I like this organization is much better than the
other way around, i.e. fsck.missingTagger=warn, as we do not want
the namespace under fsck.* for variables that control the behaviour
of fsck that are *NOT* kinds of questionable conditions fsck can
find.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 The parser I wrote actually accepts both versions, allowing me to skip the
 tedious step to convert the camelCased config setting into a
 lower-case-dashed version to pass to `index-pack` or `unpack-objects`,
 only to be parsed by the same parser as `fsck` would use directly.

 So I am rather happy with the fact that the parser handles both camelCased
 and lower-case-dashed versions.

That is myopic view of the world that ignores maintainability and
teachability, doing disservice to our user base.

What message does it send to an unsuspecting new user that
fsck.random-error is silently accepted (because we will never
document it) as an alias for fsck.randomError, while most of the
configuration variables will not take such an alias?

 I suspect that it would be much better if the configuration variables
 were organized the other way around, e.g.
 
  $ git config fsck.warn missingTagger,someOtherKindOfError

 I had something similar in an earlier version of my patch series, but it
 was shot down rightfully: if you want to allow inheriting defaults from
 $HOME/.gitconfig, you have to configure the severity levels individually.

Hmmm.  What's wrong with fsck.warn -missingTagger that overrides
the earlier one, or even fsck.info missingTagger after having
fsck.warn other,missingTagger,yetanother, with the usual last one
wins rule?

Whoever shot it down rightfully is wrong here, I would think.

 But the proposed organization to use one variable per questionable
 event type (as opposed to one variable per severity level) would
 lead to a one-shot override of this form, e.g.
 
  $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn
 
 which I think is insane to require us to support unbound number of
 dashed options.

 The intended use case is actually *not* the command-line, but the config
 file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and*
 .git/config to customize the settings.

But we do need to worry about one-shot override from the command
line.  A configuration that sticks without a way to override is a
no-no.

 Or are you saying that we allow git config core.file-mode true
 from the command line to set core.fileMode configuration?

 I do not understand this reference.

I was puzzled by your command line and wondering if you meant
from the command line, aVariable can be spelled a-variable.

 I did not suggest to change `git
 config`, did I? If I did, I apologize; it was definitely *not* my
 intention to change long-standing customs.

Then fsck.missing-tagger is definitely out.  Long standing customs
is that a multi-word token at the first and the last level is not
dashed-multi-word.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Johannes Schindelin
Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  The parser I wrote actually accepts both versions, allowing me to skip
  the tedious step to convert the camelCased config setting into a
  lower-case-dashed version to pass to `index-pack` or `unpack-objects`,
  only to be parsed by the same parser as `fsck` would use directly.
 
  So I am rather happy with the fact that the parser handles both
  camelCased and lower-case-dashed versions.
 
 That is myopic view of the world that ignores maintainability and
 teachability, doing disservice to our user base.

Okay, so just to clarify: you want me to

- split the parser into

- a parser that accepts only camelCased variable names when they
  come from the config (for use in fsck and receive-pack), and

- another parser that rejects camelCased variable names and only
  accepts lower-case-dashed, intended for command-line parsing
  in fsck, index-pack and unpack-objects, and

- consequently have a converter from the camelCased variable names we
  receive from the config in receive-pack so we can pass lower-case-dashed
  settings to index-pack and unpack-objects.

If you want it this way, I will do it this way.

 What message does it send to an unsuspecting new user that
 fsck.random-error is silently accepted (because we will never document
 it) as an alias for fsck.randomError, while most of the configuration
 variables will not take such an alias?

I will not participate in a discussion about consistency again. There is
nothing that can be done about it; what matters is what you will accept
and what not. I will make the code stricter (and consequently more
complex) if that is what you want.

  I suspect that it would be much better if the configuration variables
  were organized the other way around, e.g.
  
 $ git config fsck.warn missingTagger,someOtherKindOfError
 
  I had something similar in an earlier version of my patch series, but
  it was shot down rightfully: if you want to allow inheriting defaults
  from $HOME/.gitconfig, you have to configure the severity levels
  individually.
 
 Hmmm.  What's wrong with fsck.warn -missingTagger that overrides
 the earlier one, or even fsck.info missingTagger after having
 fsck.warn other,missingTagger,yetanother, with the usual last one
 wins rule?

I will change the code (next year...).

  But the proposed organization to use one variable per questionable
  event type (as opposed to one variable per severity level) would lead
  to a one-shot override of this form, e.g.
  
 $ git fsck --missing-tagger=warn --some-other-kind-of-error=warn
  
  which I think is insane to require us to support unbound number of
  dashed options.
 
  The intended use case is actually *not* the command-line, but the config
  file, in particular allowing /etc/gitconfig, $HOME/.gitconfig *and*
  .git/config to customize the settings.
 
 But we do need to worry about one-shot override from the command
 line.  A configuration that sticks without a way to override is a
 no-no.

And of course you can, by specifying the config setting via the -c
command-line option. The only inconsistency here is that all other
command-line options are lower-case-dashed, while the config settings are
camelCased.

  Or are you saying that we allow git config core.file-mode true from
  the command line to set core.fileMode configuration?
 
  I do not understand this reference.
 
 I was puzzled by your command line and wondering if you meant
 from the command line, aVariable can be spelled a-variable.

Well, of course, if you call `git -c aVariable command
--option=a-variable` you have a nice accumulation of styles right there
;-)

  I did not suggest to change `git config`, did I? If I did, I
  apologize; it was definitely *not* my intention to change
  long-standing customs.
 
 Then fsck.missing-tagger is definitely out.  Long standing customs
 is that a multi-word token at the first and the last level is not
 dashed-multi-word.

But I already changed all of the patches to fsck.missingTagger.

The only thing I did not do yet is to split the parser into two, one
accepting only camelCased, one accepting only lower-case-dashed options,
and a translator to convert from camelCase to lower-case-dashed versions
(because it is a lot of work and additional complexity, as well as
opportunity for bugs to hide because we'll have three code paths). I asked
you above whether you want that, and I will do it if you say that you do.

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Okay, so just to clarify: you want me to

 - split the parser into

   - a parser that accepts only camelCased variable names when they
 come from the config (for use in fsck and receive-pack), and

OK.

   - another parser that rejects camelCased variable names and only
 accepts lower-case-dashed, intended for command-line parsing
 in fsck, index-pack and unpack-objects, and

 - consequently have a converter from the camelCased variable names we
   receive from the config in receive-pack so we can pass lower-case-dashed
   settings to index-pack and unpack-objects.

I am not sure about the latter two.  This needs a design discussion
what the command line options should be.

I think the command line should be like this:

git cmd --warn=missingTags,missingAuthor

in the first place, i.e. we may invent tokens to denote new kinds
of errors as we improve fsck, not with we may add options for new
kinds of errors, i.e. the command line should not look like this:

git cmd --missing-tags=warn --missing-author=warn

And from that point of view, I see no reason to support the dashed
variant anywhere in the code, neither in the config parser or in the
command line parser.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Johannes Schindelin
Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  Okay, so just to clarify: you want me to
 
  - split the parser into
 
  - a parser that accepts only camelCased variable names when they
  come from the config (for use in fsck and receive-pack), and
 
 OK.
 
  - another parser that rejects camelCased variable names and only
accepts lower-case-dashed, intended for command-line parsing
in fsck, index-pack and unpack-objects, and
 
  - consequently have a converter from the camelCased variable names we
receive from the config in receive-pack so we can pass lower-case-dashed
settings to index-pack and unpack-objects.
 
 I am not sure about the latter two.  This needs a design discussion
 what the command line options should be.
 
 I think the command line should be like this:
 
   git cmd --warn=missingTags,missingAuthor

Okay. This contradicts the convention where Git uses lower-case-dashed
command-line option values (e.g. on-demand, error-all, etc) and no
camelCased options were present so far. But your wish is my command.

Ciao,
Dscho
 
 in the first place, i.e. we may invent tokens to denote new kinds
 of errors as we improve fsck, not with we may add options for new
 kinds of errors, i.e. the command line should not look like this:
 
   git cmd --missing-tags=warn --missing-author=warn
 
 And from that point of view, I see no reason to support the dashed
 variant anywhere in the code, neither in the config parser or in the
 command line parser.
 
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Johannes Schindelin johannes.schinde...@gmx.de writes:

 Okay, so just to clarify: you want me to

 - split the parser into

  - a parser that accepts only camelCased variable names when they
come from the config (for use in fsck and receive-pack), and

 OK.

  - another parser that rejects camelCased variable names and only
accepts lower-case-dashed, intended for command-line parsing
in fsck, index-pack and unpack-objects, and

 - consequently have a converter from the camelCased variable names we
   receive from the config in receive-pack so we can pass lower-case-dashed
   settings to index-pack and unpack-objects.

 I am not sure about the latter two.  This needs a design discussion
 what the command line options should be.

 I think the command line should be like this:

   git cmd --warn=missingTags,missingAuthor

 in the first place, i.e. we may invent tokens to denote new kinds
 of errors as we improve fsck, not with we may add options for new
 kinds of errors, i.e. the command line should not look like this:

   git cmd --missing-tags=warn --missing-author=warn

 And from that point of view, I see no reason to support the dashed
 variant anywhere in the code, neither in the config parser or in the
 command line parser.

Having said that, I think missingTags etc. should not be
configuration variable names (instead, they should be values).

Because of that, I do not think we need consistency between the way
these tokens that denote kinds of errors fsck denotes are spelled
and the way configuration variable names are spelled.

In other words, I do not think there is nothing that comes from how
our configuration variable names are spelled that gives preference
to one over the other between the two styles:

(1) Tokens are camelCased

[fsck]
warn = missingTagger,missingAuthor
error = zeroPadTreeEntry

$ git cmd --warn=missingTagger,missingAuthor

(2) Tokens are dashed-multi-words

[fsck]
warn = missing-tagger,missing-author
error = zero-pad-tree-entry

$ git cmd --warn=missing-tagger,missing-author

Do I have a strong preference between these two?

Not really.  My gut reaction is that (2) may be easier to read, but
I can be persuaded either way.

Who else has/had opinions on this?  Earlier you said that the
configuration the other way, i.e. [fsck] warn = missingTag, was
shot down---who did shoot it?  Perhaps that person may be able to
point out where in my thinking above I am going in the wrong
direction.

Thanks.

[Footnote]

In either case, I'd recommend that we take [ ,]+ as inter-token
separator to ease the use on the command line and config file, to
allow these:

[fsck]
warn = missingTagger missingAuthor
warn = missingTagger,missingAuthor

$ git cmd --warn missingTagger,missingAuthor
$ git cmd --warn missingTagger missingAuthor

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Johannes Schindelin
Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

 Having said that, I think missingTags etc. should not be configuration
 variable names (instead, they should be values).
 
 Because of that, I do not think we need consistency between the way
 these tokens that denote kinds of errors fsck denotes are spelled and
 the way configuration variable names are spelled.

Okay. That makes more sense.

Now I can remove the complexity introduced by teaching the parser to
accept camelCased values, and we're golden.

 In either case, I'd recommend that we take [ ,]+ as inter-token
 separator to ease the use on the command line and config file

And this is indeed the case:

+void fsck_strict_mode(struct fsck_options *options, const char *mode)
+...
+   while (*mode) {
+   int len = strcspn(mode,  ,|), equal, msg_id;
+...

In other words, I even allowed the pipe symbol as separator.

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Hi Junio,

 On Tue, 23 Dec 2014, Junio C Hamano wrote:

 Having said that, I think missingTags etc. should not be configuration
 variable names (instead, they should be values).
 
 Because of that, I do not think we need consistency between the way
 these tokens that denote kinds of errors fsck denotes are spelled and
 the way configuration variable names are spelled.

 Okay. That makes more sense.

I am sorry that I didn't step back and think about it earlier to
notice that we shouldn't be talking about configuration variable
name syntax.  I could have saved us time going back and forth if
I did so earlier.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Johannes Schindelin
Hi Junio,

On Tue, 23 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  On Tue, 23 Dec 2014, Junio C Hamano wrote:
 
  Having said that, I think missingTags etc. should not be
  configuration variable names (instead, they should be values).
  
  Because of that, I do not think we need consistency between the way
  these tokens that denote kinds of errors fsck denotes are spelled
  and the way configuration variable names are spelled.
 
  Okay. That makes more sense.
 
 I am sorry that I didn't step back and think about it earlier to notice
 that we shouldn't be talking about configuration variable name syntax.
 I could have saved us time going back and forth if I did so earlier.

Do not worry. You were just trying to make this software better, same as
I tried.

Unfortunately, I will not be able to submit v2 of this patch series this
year, but I will do so in the second week of January (including the change
to the global array with the default severity levels because I do want to
see this feature integrated).

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-23 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Unfortunately, I will not be able to submit v2 of this patch series this
 year, but I will do so in the second week of January (including the change
 to the global array with the default severity levels because I do want to
 see this feature integrated).

Heh, we are not in a hurry.  Enjoy your holidays.

A happy new year to you in advance ;-)

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-22 Thread Johannes Schindelin
Hi Junio,

On Wed, 10 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  We already have support in `git receive-pack` to deal with some legacy
  repositories which have non-fatal issues.
 
  Let's make `git fsck` itself useful with such repositories, too, by
  allowing users to ignore known issues, or at least demote those issues
  to mere warnings.
 
  Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
  missing emails in author, committer and tagger lines.
 
 Hopefully I do not have to repeat myself, but please do not have
 punctuations in the first and the last level of configuration variable
 names, i.e. fsck.missingEmail, not mising-email.

I vetted the complete patch series and think I caught them all.

Or do you want the error messages to also use camelCased IDs, i.e.

warning in tag $tag: missingTaggerEntry: invalid format ...

instead of

warning in tag $tag: missing-tagger-entry: invalid format ...

? It is easy to do, but looks slightly uglier to this developer's eyes...

 Should these be tied to receive-pack ones in any way?  E.g. if you
 set fsck.missingEmail to ignore, you do not have to do the same for
 receive and accept a push with the specific error turned off?
 
 Not a rhetorical question.  I can see it argued both ways.  The
 justification to defend the position of not tying these two I would
 have is so that I can be more strict to newer breakages (i.e. not
 accepting a push that introduce a new breakage by not ignoring with
 receive.fsck.*) while allowing breakages that are already present.
 The justification for the opposite position is to make it more
 convenient to write a consistent policy.  Whichever way is chosen,
 we would want to see the reason left in the log message so that
 people do not have to wonder what the original motivation was when
 they decide if it is a good idea to change this part of the code.

Hmm. I really tried very hard to separate the fsck.* from the receive.*
settings because the two code paths already behave differently: many
warnings reported (and ignored) by fsck are fatal errors when pushing with
receive.fsckObjects=true. My understanding was that these differences are
deliberate, and my interpretation was that the fsck and the receive
settings were considered to be fundamentally different, even if the same
fsck machinery performs the validation.

If you agree, I would rephrase this line of argument and add it to the
commit message. Do you agree?

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-22 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Or do you want the error messages to also use camelCased IDs, i.e.

   warning in tag $tag: missingTaggerEntry: invalid format ...

 instead of

   warning in tag $tag: missing-tagger-entry: invalid format ...

 ? It is easy to do, but looks slightly uglier to this developer's eyes...

Do you really need to know what I think?  Can I say The latter is
probably slightly better, but both look ugly to me?

If we want a human readable message

warning: tag object lacks tagger field '$tag'

would be my preference.

But I personally think it may not be necessary to give a pretty
message that later can go through l10n here.  If we take that
position, it is just a machine-readble error token, so I'd say
whichever way you find more readable is OK.

 Should these be tied to receive-pack ones in any way?  E.g. if you
 set fsck.missingEmail to ignore, you do not have to do the same for
 receive and accept a push with the specific error turned off?
 
 Not a rhetorical question.  I can see it argued both ways.  The
 justification to defend the position of not tying these two I would
 have is so that I can be more strict to newer breakages (i.e. not
 accepting a push that introduce a new breakage by not ignoring with
 receive.fsck.*) while allowing breakages that are already present.
 The justification for the opposite position is to make it more
 convenient to write a consistent policy.  Whichever way is chosen,
 we would want to see the reason left in the log message so that
 people do not have to wonder what the original motivation was when
 they decide if it is a good idea to change this part of the code.

 Hmm. I really tried very hard to separate the fsck.* from the receive.*
 settings because the two code paths already behave differently:...

 If you agree, I would rephrase this line of argument and add it to the
 commit message. Do you agree?

Yeah, that reasoning sounds sensible.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-22 Thread Johannes Schindelin
Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  Or do you want the error messages to also use camelCased IDs, i.e.
 
  warning in tag $tag: missingTaggerEntry: invalid format ...
 
  instead of
 
  warning in tag $tag: missing-tagger-entry: invalid format ...
 
  ? It is easy to do, but looks slightly uglier to this developer's eyes...
 
 Do you really need to know what I think?

Well, but yes ;-)

 Can I say The latter is probably slightly better, but both look ugly to
 me?

Of course you can say that! ;-) The problem these ugly messages try to
solve is to give the user a hint which setting to change if they want to
override the default behavior, though...

 If we want a human readable message
 
 warning: tag object lacks tagger field '$tag'
 
 would be my preference.
 
 But I personally think it may not be necessary to give a pretty
 message that later can go through l10n here.  If we take that
 position, it is just a machine-readble error token, so I'd say
 whichever way you find more readable is OK.

Okay, I will leave it as-is, then.

  Should these be tied to receive-pack ones in any way?  E.g. if you
  set fsck.missingEmail to ignore, you do not have to do the same for
  receive and accept a push with the specific error turned off?
  
  Not a rhetorical question.  I can see it argued both ways.  The
  justification to defend the position of not tying these two I would
  have is so that I can be more strict to newer breakages (i.e. not
  accepting a push that introduce a new breakage by not ignoring with
  receive.fsck.*) while allowing breakages that are already present.
  The justification for the opposite position is to make it more
  convenient to write a consistent policy.  Whichever way is chosen,
  we would want to see the reason left in the log message so that
  people do not have to wonder what the original motivation was when
  they decide if it is a good idea to change this part of the code.
 
  Hmm. I really tried very hard to separate the fsck.* from the receive.*
  settings because the two code paths already behave differently:...
 
  If you agree, I would rephrase this line of argument and add it to the
  commit message. Do you agree?
 
 Yeah, that reasoning sounds sensible.

I added this paragraph:

In the same spirit that `git receive-pack`'s usage of the fsck machinery
differs from `git fsck`'s – some of the non-fatal warnings in `git fsck`
are fatal with `git receive-pack` when receive.fsckObjects = true, for
example – we strictly separate the fsck.* from the receive.fsck.*
settings.

Ciao,
Dscho

Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-22 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Of course you can say that! ;-) The problem these ugly messages try to
 solve is to give the user a hint which setting to change if they want to
 override the default behavior, though...

Ahh, OK, then dashed form would not work as a configuration variable
names, so missingTaggerEntry would be the only usable option.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-22 Thread Johannes Schindelin
Hi Junio,

On Mon, 22 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  Of course you can say that! ;-) The problem these ugly messages try to
  solve is to give the user a hint which setting to change if they want to
  override the default behavior, though...
 
 Ahh, OK, then dashed form would not work as a configuration variable
 names, so missingTaggerEntry would be the only usable option.

Except that cunning me has made it so that both missing-tagger-entry *and*
missingTaggerEntry work...

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-22 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Hi Junio,

 On Mon, 22 Dec 2014, Junio C Hamano wrote:

 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
  Of course you can say that! ;-) The problem these ugly messages try to
  solve is to give the user a hint which setting to change if they want to
  override the default behavior, though...
 
 Ahh, OK, then dashed form would not work as a configuration variable
 names, so missingTaggerEntry would be the only usable option.

 Except that cunning me has made it so that both missing-tagger-entry *and*
 missingTaggerEntry work...

Then the missing-tagger-entry side needs to be dropped.  The naming
does not conform to the way how we name our officially supported
configuration variables.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 16/18] fsck: support demoting errors to warnings

2014-12-10 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 We already have support in `git receive-pack` to deal with some legacy
 repositories which have non-fatal issues.

 Let's make `git fsck` itself useful with such repositories, too, by
 allowing users to ignore known issues, or at least demote those issues
 to mere warnings.

 Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
 missing emails in author, committer and tagger lines.

Hopefully I do not have to repeat myself, but please do not have
punctuations in the first and the last level of configuration
variable names, i.e. fsck.missingEmail, not mising-email.

Should these be tied to receive-pack ones in any way?  E.g. if you
set fsck.missingEmail to ignore, you do not have to do the same for
receive and accept a push with the specific error turned off?

Not a rhetorical question.  I can see it argued both ways.  The
justification to defend the position of not tying these two I would
have is so that I can be more strict to newer breakages (i.e. not
accepting a push that introduce a new breakage by not ignoring with
receive.fsck.*) while allowing breakages that are already present.
The justification for the opposite position is to make it more
convenient to write a consistent policy.  Whichever way is chosen,
we would want to see the reason left in the log message so that
people do not have to wonder what the original motivation was when
they decide if it is a good idea to change this part of the code.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 16/18] fsck: support demoting errors to warnings

2014-12-08 Thread Johannes Schindelin
We already have support in `git receive-pack` to deal with some legacy
repositories which have non-fatal issues.

Let's make `git fsck` itself useful with such repositories, too, by
allowing users to ignore known issues, or at least demote those issues
to mere warnings.

Example: `git -c fsck.missing-email=ignore fsck` would hide problems with
missing emails in author, committer and tagger lines.

Signed-off-by: Johannes Schindelin johannes.schinde...@gmx.de
---
 Documentation/config.txt | 13 +
 builtin/fsck.c   | 15 +++
 t/t1450-fsck.sh  | 11 +++
 3 files changed, 39 insertions(+)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index b3276ee..fa58c26 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1192,6 +1192,19 @@ filter.driver.smudge::
object to a worktree file upon checkout.  See
linkgit:gitattributes[5] for details.
 
+fsck.*::
+   With these options, fsck errors can be switched to warnings and
+   vice versa by setting e.g. `fsck.bad-name` to `warn` or `error`
+   (or `ignore` to hide those errors completely). For convenience,
+   fsck prefixes the error/warning with the name of the option, e.g.
+   missing-email: invalid author/committer line - missing email
+   means that setting `fsck.missing-email` to `ignore` will hide that
+   issue.  For convenience, camelCased options are accepted, too (e.g.
+   `fsck.missingEmail`).
++
+This feature is intended to support working with legacy repositories
+which cannot be repaired without disruptive changes.
+
 gc.aggressiveDepth::
The depth parameter used in the delta compression
algorithm used by 'git gc --aggressive'.  This defaults
diff --git a/builtin/fsck.c b/builtin/fsck.c
index 99d4538..2b8faa4 100644
--- a/builtin/fsck.c
+++ b/builtin/fsck.c
@@ -46,6 +46,19 @@ static int show_dangling = 1;
 #define DIRENT_SORT_HINT(de) ((de)-d_ino)
 #endif
 
+static int fsck_config(const char *var, const char *value, void *cb)
+{
+   if (starts_with(var, fsck.)) {
+   struct strbuf sb = STRBUF_INIT;
+   strbuf_addf(sb, %s=%s, var + 5, value ? value : error);
+   fsck_strict_mode(fsck_obj_options, sb.buf);
+   strbuf_release(sb);
+   return 0;
+   }
+
+   return git_default_config(var, value, cb);
+}
+
 static void objreport(struct object *obj, const char *severity,
   const char *err)
 {
@@ -638,6 +651,8 @@ int cmd_fsck(int argc, const char **argv, const char 
*prefix)
include_reflogs = 0;
}
 
+   git_config(fsck_config, NULL);
+
fsck_head_link();
fsck_object_dir(get_object_directory());
 
diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
index 019fddd..d74df19 100755
--- a/t/t1450-fsck.sh
+++ b/t/t1450-fsck.sh
@@ -283,6 +283,17 @@ test_expect_success 'rev-list --verify-objects with bad 
sha1' '
grep -q error: sha1 mismatch 63ff 
out
 '
 
+test_expect_success 'force fsck to ignore double author' '
+   git cat-file commit HEAD basis 
+   sed s/^author .*/,/ basis | tr , \\n multiple-authors 
+   new=$(git hash-object -t commit -w --stdin multiple-authors) 
+   test_when_finished remove_object $new 
+   git update-ref refs/heads/bogus $new 
+   test_when_finished git update-ref -d refs/heads/bogus 
+   test_must_fail git fsck 
+   git -c fsck.multiple-authors=ignore fsck
+'
+
 _bz='\0'
 _bz5=$_bz$_bz$_bz$_bz$_bz
 _bz20=$_bz5$_bz5$_bz5$_bz5
-- 
2.0.0.rc3.9669.g840d1f9

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html