Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-04 Thread Johannes Schindelin
Hi Junio,

On 2015-02-04 04:50, Junio C Hamano wrote:
 Johannes Schindelin johannes.schinde...@gmx.de writes:
 
 [fsck level]
 missingAuthor = error

 , which looks funny. level is a constant, so it seems superfluous.
 
 Yes, it is superfluous, but is one way to avoid the ambiguity with
 skiplist.  Structuring it like this would not be so bad, either,
 though.
 
   [fsck]
   error = missingAuthor[, other kinds of errors...]
 
 A small set like {ignore, warn, error} is easily maintainable not to
 conflict with skiplist and others.

But you augmented the case against the {ignore, warn, error} structure with 
this yourself:

 With fsck.error=missingAuthor[,other kinds], you would
 instead have to do a bit more silly post-processing
 
   git config -l | sed -ne '
   /^fsck\./{
   # make it var=,token1,token2,token3,
   s/=/=,/
 s/$/,/
 s/[   ]*//g
   s|^fsck\.\([^=]*\)=.*,missingAuthor,.*|\1|p
   }
   ' | tail -n 1

If a script to determine the current state of affairs is already convoluted, 
how complicated would it be for users of this feature? No, let's go with 
Michael's suggestion and have a much more elegant (read: easy to grasp) 
configuration.

I see that the extra .level. is universally frowned on, and therefore am 
convinced that it would be bad to add it.

As to the conflict with skiplist, by your own argument (small set ... is 
easily maintainable) it is not a problem at all.

 This becomes important when I want to catch obvious problems such as
 fsck.missingautor': if I have an extra '.level', I can be certain that
 it is a typo rather than a config setting unrelated to the severity
 levels.
 
 [fsck] error = missingAutor would let you catch the typo in a similar
 way with the same context clue, so this does not decide which is
 better, either.

No, you are correct, this does not decide it. What decides it that 
`fsck.missingAuthor = error` is better than `fsck.error = missingAuthor` is 
that the user does not need to wrap her head around the significance of the 
order of mutually overriding `error`, `warn` and `ignore` settings in the 
former.

As to the typo handling, you are absolutely correct that it is easy to handle 
skiplist first and then expect all other fsck.* settings (or receive.fsck.* 
settings) to match a message ID. To make things even smoother, I think it would 
make most sense to only *warn* about typos instead of *erroring out*.

 I wonder if
 
   [fsckError]
   missingAuthor = error
 missingTagger = warn

Hmm. I am not so sure that `fsckError` is the correct term. After all, we can 
also upgrade warnings to errors, not only demote errors to warnings. So I guess 
the `fsck.*` naming scheme is not so bad after all!

 Unfortunately, I have to pass the `receive.fsck.*` settings from
 `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
 command-line, because it is `git-receive-pack` that consumes the config
 setting, but it is one of `git-unpack-objects` and `git-index-pack` that
 has to act on it...
 
 But receive-pack at some point decides what, if anything, needs to
 be passed when invoking unpack-objects, or index-pack, no?  Why is
 it hard to pass -c var=val at the beginning where it would have
 passed --strictness=var=val at the end?

It is not hard, it is just inelegant: 1) we would have to introduce a new 
function to handle the config settings in `unpack-objects`, 2) the command-line 
would be longer (for what benefit?), and 3) index-pack uses run_command() to 
launch the children [*1*] and wouldn't you find it super-ugly if that argv 
started with a -c, ...? I would.

We already have command-line handling in both index-pack and unpack-objects 
(and config handling only in the former), we already have --strict handling in 
particular, and arguably the severity levels are tightly connected to that 
--strict option that at least in my opinion it makes sense to understand that 
the severity levels are optional parameters to that command-line option.

So in short, I maintain that passing the options via the config mechanism would 
just make things more complicated, and in no way better.

 Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
 mechanism? If I run

 git -c foo.bar=baz $CMD

 , then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
 'foo.bar=baz', which causes child processes to treat that value as a
 configuration setting. I don't have a lot of experience with this but I
 think it should do what you need.

 This is true, but please remember that the receive.fsck.* settings
 should be heeded by index-pack/unpack-objects *only* if one of the
 latter programs is called by receive-pack. It would therefore be a
 little funny (or wrong, depending on your point of view) if, say,
 index-pack would respect the receive.fsck.* settings.
 
 That 

Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-03 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 [fsck level]
 missingAuthor = error
 
 , which looks funny. level is a constant, so it seems superfluous.

Yes, it is superfluous, but is one way to avoid the ambiguity with
skiplist.  Structuring it like this would not be so bad, either,
though.

[fsck]
error = missingAuthor[, other kinds of errors...]

A small set like {ignore, warn, error} is easily maintainable not to
conflict with skiplist and others.

So that avoid ambiguity with skiplist does not favor either choice
in any significant way.

 This becomes important when I want to catch obvious problems such as
 fsck.missingautor': if I have an extra '.level', I can be certain that
 it is a typo rather than a config setting unrelated to the severity
 levels.

[fsck] error = missingAutor would let you catch the typo in a similar
way with the same context clue, so this does not decide which is
better, either.

One clear benefit I can see in it is that you can do

git config fsck.level.missingAuthor

in scripts that wants to learn the current setting for a single
variable.  With fsck.error=missingAuthor[,other kinds], you would
instead have to do a bit more silly post-processing

git config -l | sed -ne '
/^fsck\./{
# make it var=,token1,token2,token3,
s/=/=,/
s/$/,/
s/[ ]*//g
s|^fsck\.\([^=]*\)=.*,missingAuthor,.*|\1|p
}
' | tail -n 1

to grab the last fsck.{error,ignore,...}= thing that has the token
(I personally do not think the latter is so bad, though).

I wonder if

[fsckError]
missingAuthor = error
missingTagger = warn

wouldn't be a better way, though.  We'd keep the easier scripting

git config fsckError.missingTagger

There is nothing that says that the top-level grouping must match
the Git subcommand name.  Nothing says that one Git subcommand can
own at most one namespace, either.  Nothing stops us from reserving
fsckError top-level namespace for variable name collision avoidance
with other fsck.* variables, if that gives us a better system.

 Unfortunately, I have to pass the `receive.fsck.*` settings from
 `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
 command-line, because it is `git-receive-pack` that consumes the config
 setting, but it is one of `git-unpack-objects` and `git-index-pack` that
 has to act on it...

But receive-pack at some point decides what, if anything, needs to
be passed when invoking unpack-objects, or index-pack, no?  Why is
it hard to pass -c var=val at the beginning where it would have
passed --strictness=var=val at the end?

 Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
 mechanism? If I run
 
 git -c foo.bar=baz $CMD
 
 , then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
 'foo.bar=baz', which causes child processes to treat that value as a
 configuration setting. I don't have a lot of experience with this but I
 think it should do what you need.

 This is true, but please remember that the receive.fsck.* settings
 should be heeded by index-pack/unpack-objects *only* if one of the
 latter programs is called by receive-pack. It would therefore be a
 little funny (or wrong, depending on your point of view) if, say,
 index-pack would respect the receive.fsck.* settings.

That means it would be fine if receive-pack invokes (when it sees
receive.fsck.severity=missingAuthor=error,missingTagger=warn config
meant for it and was told with receive.fsckObjects to check the
incoming objects) a command line like this:

git -c fsckError.missingAuthor=error \
-c fsckError.missingTagger=warn \
index-pack $args...

(or whatever variable names and name structure we settle on).  And
the index-pack command does not have to even know there are
receive.fsck.* variables at all, no?

Another way to do that may be for receive-pack to invoke

git index-pack --use-fsck-severity=receive.fsck $args...

to instruct it to look at receive.fsck.* variables, again when and
only when receive-pack wants to do so.  I think either way would be
fine, as this communication is an internal implementation detail
between receive-pack and index-pack and is not meant to be exposed
to the end users anyway.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-03 Thread Michael Haggerty
On 02/02/2015 05:48 PM, Johannes Schindelin wrote:
 On 2015-02-02 13:43, Michael Haggerty wrote:
 On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
 Hi all (in particular Junio),

 On 2015-01-31 22:04, Johannes Schindelin wrote:

 [...] switch to fsck.severity to address Michael's concerns that
 letting fsck.(error|warn|ignore)'s comma-separated lists possibly
 overriding each other partially;

 Having participated in the CodingStyle thread, I came to the
 conclusion that the fsck.severity solution favors syntax over
 intuitiveness.

 Therefore, I would like to support the case for
 `fsck.level.missingAuthor` (note that there is an extra .level. in
 contrast to earlier suggestions).

 Why level?
 
 Severity level, or error level. Maybe .severity. would be better?

Sorry, I should have been clearer. I understand why the word level
makes sense, as opposed to, say, peanut-butter. What I don't
understand is why a middle word is needed at all. In the config file it
will look like

[fsck level]
missingAuthor = error

, which looks funny. level is a constant, so it seems superfluous.

If anything, it might be more useful to allow an optional middle word to
allow the strictness level to be adjusted based on which command
encounters the problem. For example, if you want to tolerate existing
commits that have missing authors, but not allow any new ones to be
pushed, you could set

[strictness]
missingAuthor = ignore
[strictness receive-pack]
missingAuthor = error

(There's probably a better word than strictness, but you get the idea.)

 The benefits:

 - it is very, very easy to understand

 - cumulative settings are intuitively cumulative, i.e. setting
 `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
 completely unaffected

 - it is very easy to enquire and set the levels via existing `git
 config` calls

 Now, there is one downside, but *only* if we ignore Postel's law.

 Postel's law (be lenient in what you accept as input, but strict in
 your output) would dictate that our message ID parser accept both
 missing-author and missingAuthor if we follow the inconsistent
 practice of using lowercase-dashed keys on the command-line but
 CamelCased ones in the config.

 However, earlier Junio made very clear that the parser is required to
 fail to parse missing-author in the config, and to fail to parse
 missingAuthor on the command-line.

 Therefore, the design I recommend above will require two, minimally
 different parsers for essentially the same thing.

 IMHO this is a downside that is by far outweighed by the ease of use
 of the new feature, therefore I am willing to bear the burden of
 implementation.

 I again encourage you to consider skipping the implementation of
 command-line options entirely. It's not like users are going to want to
 use different options for different invocations. Let them use

 git -c fsck.level.missingAuthor=ignore fsck

 if they really want to play around, then

 git config fsck.level.missingAuthor ignore

 to make it permanent. After that they will never have to worry about
 that option again.
 
 Unfortunately, I have to pass the `receive.fsck.*` settings from
 `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
 command-line, because it is `git-receive-pack` that consumes the config
 setting, but it is one of `git-unpack-objects` and `git-index-pack` that
 has to act on it...

Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
mechanism? If I run

git -c foo.bar=baz $CMD

, then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
'foo.bar=baz', which causes child processes to treat that value as a
configuration setting. I don't have a lot of experience with this but I
think it should do what you need.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-03 Thread Johannes Schindelin
Hi Michael,

On 2015-02-03 16:11, Michael Haggerty wrote:
 On 02/02/2015 05:48 PM, Johannes Schindelin wrote:
 On 2015-02-02 13:43, Michael Haggerty wrote:
 On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
 Hi all (in particular Junio),

 On 2015-01-31 22:04, Johannes Schindelin wrote:

 [...] switch to fsck.severity to address Michael's concerns that
 letting fsck.(error|warn|ignore)'s comma-separated lists possibly
 overriding each other partially;

 Having participated in the CodingStyle thread, I came to the
 conclusion that the fsck.severity solution favors syntax over
 intuitiveness.

 Therefore, I would like to support the case for
 `fsck.level.missingAuthor` (note that there is an extra .level. in
 contrast to earlier suggestions).

 Why level?

 Severity level, or error level. Maybe .severity. would be better?
 
 Sorry, I should have been clearer. I understand why the word level
 makes sense, as opposed to, say, peanut-butter. What I don't
 understand is why a middle word is needed at all. In the config file it
 will look like
 
 [fsck level]
 missingAuthor = error
 
 , which looks funny. level is a constant, so it seems superfluous.
 
 If anything, it might be more useful to allow an optional middle word to
 allow the strictness level to be adjusted based on which command
 encounters the problem. For example, if you want to tolerate existing
 commits that have missing authors, but not allow any new ones to be
 pushed, you could set
 
 [strictness]
 missingAuthor = ignore
 [strictness receive-pack]
 missingAuthor = error
 
 (There's probably a better word than strictness, but you get the idea.)

Ah. Well, the idea of the middle constant is to separate the severity levels 
from all other fsck (or receive.fsck) settings. The 'fsck.skiplist' setting 
that I introduce in this patch series, for example, looks pretty much the same 
as 'fsck.missingauthor', but they have different roles.

This becomes important when I want to catch obvious problems such as 
'fsck.missingautor': if I have an extra '.level', I can be certain that it is a 
typo rather than a config setting unrelated to the severity levels.

 The benefits:

 - it is very, very easy to understand

 - cumulative settings are intuitively cumulative, i.e. setting
 `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
 completely unaffected

 - it is very easy to enquire and set the levels via existing `git
 config` calls

 Now, there is one downside, but *only* if we ignore Postel's law.

 Postel's law (be lenient in what you accept as input, but strict in
 your output) would dictate that our message ID parser accept both
 missing-author and missingAuthor if we follow the inconsistent
 practice of using lowercase-dashed keys on the command-line but
 CamelCased ones in the config.

 However, earlier Junio made very clear that the parser is required to
 fail to parse missing-author in the config, and to fail to parse
 missingAuthor on the command-line.

 Therefore, the design I recommend above will require two, minimally
 different parsers for essentially the same thing.

 IMHO this is a downside that is by far outweighed by the ease of use
 of the new feature, therefore I am willing to bear the burden of
 implementation.

 I again encourage you to consider skipping the implementation of
 command-line options entirely. It's not like users are going to want to
 use different options for different invocations. Let them use

 git -c fsck.level.missingAuthor=ignore fsck

 if they really want to play around, then

 git config fsck.level.missingAuthor ignore

 to make it permanent. After that they will never have to worry about
 that option again.

 Unfortunately, I have to pass the `receive.fsck.*` settings from
 `git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the
 command-line, because it is `git-receive-pack` that consumes the config
 setting, but it is one of `git-unpack-objects` and `git-index-pack` that
 has to act on it...
 
 Wouldn't that work automatically via the GIT_CONFIG_PARAMETERS
 mechanism? If I run
 
 git -c foo.bar=baz $CMD
 
 , then git-$CMD is invoked with GIT_CONFIG_PARAMETERS set to
 'foo.bar=baz', which causes child processes to treat that value as a
 configuration setting. I don't have a lot of experience with this but I
 think it should do what you need.

This is true, but please remember that the receive.fsck.* settings should be 
heeded by index-pack/unpack-objects *only* if one of the latter programs is 
called by receive-pack. It would therefore be a little funny (or wrong, 
depending on your point of view) if, say, index-pack would respect the 
receive.fsck.* settings.

That is why receive-pack adds a `--strict` command-line option when 
receive.fsckobjects is set to true instead of letting index-pack (or 
unpack-objects) look at the config variable receive.fsckobjects itself.

In the same spirit, I extend the `--strict` command-line option with an 

Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-02 Thread Michael Haggerty
On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
 Hi all (in particular Junio),
 
 On 2015-01-31 22:04, Johannes Schindelin wrote:
 
 [...] switch to fsck.severity to address Michael's concerns that
 letting fsck.(error|warn|ignore)'s comma-separated lists possibly
 overriding each other partially;
 
 Having participated in the CodingStyle thread, I came to the
 conclusion that the fsck.severity solution favors syntax over
 intuitiveness.
 
 Therefore, I would like to support the case for
 `fsck.level.missingAuthor` (note that there is an extra .level. in
 contrast to earlier suggestions).

Why level?

 The benefits:
 
 - it is very, very easy to understand
 
 - cumulative settings are intuitively cumulative, i.e. setting
 `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
 completely unaffected
 
 - it is very easy to enquire and set the levels via existing `git
 config` calls
 
 Now, there is one downside, but *only* if we ignore Postel's law.
 
 Postel's law (be lenient in what you accept as input, but strict in
 your output) would dictate that our message ID parser accept both
 missing-author and missingAuthor if we follow the inconsistent
 practice of using lowercase-dashed keys on the command-line but
 CamelCased ones in the config.
 
 However, earlier Junio made very clear that the parser is required to
 fail to parse missing-author in the config, and to fail to parse
 missingAuthor on the command-line.
 
 Therefore, the design I recommend above will require two, minimally
 different parsers for essentially the same thing.
 
 IMHO this is a downside that is by far outweighed by the ease of use
 of the new feature, therefore I am willing to bear the burden of
 implementation.

I again encourage you to consider skipping the implementation of
command-line options entirely. It's not like users are going to want to
use different options for different invocations. Let them use

git -c fsck.level.missingAuthor=ignore fsck

if they really want to play around, then

git config fsck.level.missingAuthor ignore

to make it permanent. After that they will never have to worry about
that option again.

And Postel needn't be offended :-)

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-02 Thread Johannes Schindelin
Hi all (in particular Junio),

On 2015-01-31 22:04, Johannes Schindelin wrote:

 [...] switch to fsck.severity to address Michael's
 concerns that letting fsck.(error|warn|ignore)'s comma-separated lists
 possibly overriding each other partially;

Having participated in the CodingStyle thread, I came to the conclusion that 
the fsck.severity solution favors syntax over intuitiveness.

Therefore, I would like to support the case for `fsck.level.missingAuthor` 
(note that there is an extra .level. in contrast to earlier suggestions).

The benefits:

- it is very, very easy to understand

- cumulative settings are intuitively cumulative, i.e. setting 
`fsck.level.missingAuthor` will leave `fsck.level.invalidEmail` completely 
unaffected

- it is very easy to enquire and set the levels via existing `git config` calls

Now, there is one downside, but *only* if we ignore Postel's law.

Postel's law (be lenient in what you accept as input, but strict in your 
output) would dictate that our message ID parser accept both missing-author 
and missingAuthor if we follow the inconsistent practice of using 
lowercase-dashed keys on the command-line but CamelCased ones in the config.

However, earlier Junio made very clear that the parser is required to fail to 
parse missing-author in the config, and to fail to parse missingAuthor on 
the command-line.

Therefore, the design I recommend above will require two, minimally different 
parsers for essentially the same thing.

IMHO this is a downside that is by far outweighed by the ease of use of the new 
feature, therefore I am willing to bear the burden of implementation.

Do you agree?

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 00/19] Introduce an internal API to interact with the fsck machinery

2015-02-02 Thread Johannes Schindelin
Hi Michael,

On 2015-02-02 13:43, Michael Haggerty wrote:
 On 02/02/2015 12:41 PM, Johannes Schindelin wrote:
 Hi all (in particular Junio),

 On 2015-01-31 22:04, Johannes Schindelin wrote:

 [...] switch to fsck.severity to address Michael's concerns that
 letting fsck.(error|warn|ignore)'s comma-separated lists possibly
 overriding each other partially;

 Having participated in the CodingStyle thread, I came to the
 conclusion that the fsck.severity solution favors syntax over
 intuitiveness.

 Therefore, I would like to support the case for
 `fsck.level.missingAuthor` (note that there is an extra .level. in
 contrast to earlier suggestions).
 
 Why level?

Severity level, or error level. Maybe .severity. would be better?

 The benefits:

 - it is very, very easy to understand

 - cumulative settings are intuitively cumulative, i.e. setting
 `fsck.level.missingAuthor` will leave `fsck.level.invalidEmail`
 completely unaffected

 - it is very easy to enquire and set the levels via existing `git
 config` calls

 Now, there is one downside, but *only* if we ignore Postel's law.

 Postel's law (be lenient in what you accept as input, but strict in
 your output) would dictate that our message ID parser accept both
 missing-author and missingAuthor if we follow the inconsistent
 practice of using lowercase-dashed keys on the command-line but
 CamelCased ones in the config.

 However, earlier Junio made very clear that the parser is required to
 fail to parse missing-author in the config, and to fail to parse
 missingAuthor on the command-line.

 Therefore, the design I recommend above will require two, minimally
 different parsers for essentially the same thing.

 IMHO this is a downside that is by far outweighed by the ease of use
 of the new feature, therefore I am willing to bear the burden of
 implementation.
 
 I again encourage you to consider skipping the implementation of
 command-line options entirely. It's not like users are going to want to
 use different options for different invocations. Let them use
 
 git -c fsck.level.missingAuthor=ignore fsck
 
 if they really want to play around, then
 
 git config fsck.level.missingAuthor ignore
 
 to make it permanent. After that they will never have to worry about
 that option again.

Unfortunately, I have to pass the `receive.fsck.*` settings from 
`git-receive-pack` to `git-unpack-objects` or `git-index-pack` via the 
command-line, because it is `git-receive-pack` that consumes the config 
setting, but it is one of `git-unpack-objects` and `git-index-pack` that has to 
act on it...

 And Postel needn't be offended :-)

;-)

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html