Re: [PATCH 0/8] fix git-config with duplicate variable entries

2012-11-23 Thread Joachim Schmitz

Junio C Hamano wrote:

Jeff King  writes:


...


Not exactly.  There are three classes of people:

- wrote scripts using --get; found out that --get barfs if you feed
  two or more of the same, and have long learned to accept it as a
  limitation and adjusted their configuration files to avoid it.
  They have been doing just fine and wouldn't benefit from this
  series at all.

- wrote scripts using --get, relying on it barfing if fed zero
  (i.e. missing) or two or more (i.e. ambiguous), perhaps a way to
  keep their configuration files (arguably unnecessarily) clean.
  They are directly harmed by this series.

- wrote scripts using --get-all and did the last-one-wins
  themselves.  They wouldn't benefit directly from this series,
  unless they are willing to spend a bit of time to remove their
  own last-one-wins logic and replace --get-all with --get (but the
  same effort is needed to migrate to --get-one).

- wanted to write scripts using --get, but after finding out that
  it barfs if you feed two, gave up emulating the internal, without
  realizing that they could do so with --get-all.  They can now
  write their scripts without using --get-all.


There are three classes ofpeople: those that can count and those that can't

Sorry could not resist ;-) 



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/8] fix git-config with duplicate variable entries

2012-11-21 Thread Junio C Hamano
Jeff King  writes:

> So what do we want to do?

Nothing.  We'd just let it graduate along with other topics, and
deal with a case where somebody screams, which is unlikely to happen
;-).
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/8] fix git-config with duplicate variable entries

2012-11-21 Thread Jeff King
On Wed, Nov 21, 2012 at 11:46:33AM -0800, Junio C Hamano wrote:

> > The problem is that scripts currently using "--get" are broken by
> > duplicate entries in include files, whereas internal git handles them
> > just fine.  Introducing a new switch means that everybody also has to
> > start using it.
> 
> Not exactly.  There are three classes of people:
> 
>  - wrote scripts using --get; found out that --get barfs if you feed
>two or more of the same, and have long learned to accept it as a
>limitation and adjusted their configuration files to avoid it.
>They have been doing just fine and wouldn't benefit from this
>series at all.
> 
>  - wrote scripts using --get, relying on it barfing if fed zero
>(i.e. missing) or two or more (i.e. ambiguous), perhaps a way to
>keep their configuration files (arguably unnecessarily) clean.
>They are directly harmed by this series.
> 
>  - wrote scripts using --get-all and did the last-one-wins
>themselves.  They wouldn't benefit directly from this series,
>unless they are willing to spend a bit of time to remove their
>own last-one-wins logic and replace --get-all with --get (but the
>same effort is needed to migrate to --get-one).
> 
>  - wanted to write scripts using --get, but after finding out that
>it barfs if you feed two, gave up emulating the internal, without
>realizing that they could do so with --get-all.  They can now
>write their scripts without using --get-all.

I have a feeling that your cases 2 and 4 do not really exist. Because we
did "last one wins" in the case that it mattered (between different
files), it was always "good enough" to just assume that using "--get"
behaved like git did internally, and nobody really noticed or cared that
we did duplicate detection at all. But that is just from my own
perspective; it is not like I did a complete survey of git-config users.

More compelling to me is that the only ones negatively affected are your
case 2, and that is qualified by the "arguably unnecessary" you wrote.

Everyone else is not negatively impacted, and can later move to using
"--get" if they want to give up any home-grown --get-all code.

> >   2. If you are a script that only wants to mimic how get retrieves
> >  a single value, then this is a bug fix, as it brings "--get" in
> >  line with git's internal use.
> 
> But by definition, no such script exists (if it used "--get", it
> didn't mimic the internal in the first place).

They do not exist if we assume that anyone using "--get" carefully
thought about the distinction. But I have the impression that is not the
case, and that they _meant_ to behave just like git, and did not realize
they were not doing so. Even our own scripts are guilty of this.

> Yeah, so the only ones that can be harmed is a config lint that uses
> the --get option with --file to make sure variables they know must
> be single value are indeed so, and they are not doing a thorough
> job, unless they are checking across files themselves, at which
> point they are better off using --get-all piped to "sort | uniq -c"
> or something.  So breakages do not matter much for correctly written
> scripts.

Right.

> > IOW, I do not see a legitimate use case for this, but see it as an extra
> > check on broken config.
> 
> Yes, I agree with the latter half of that sentence.

So what do we want to do?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/8] fix git-config with duplicate variable entries

2012-11-21 Thread Junio C Hamano
Jeff King  writes:

>> The way for the Porcelain scripts to mimick the internal "last one
>> wins" has been to grab values out of --get-all and do the "last one
>> wins" themselves, and I agree that a mode that frees them from
>> having to worry about it is a good *addition*.  I would even say
>> that if we were designing "git config" plumbing without existing
>> users, it would be the only sensible behaviour for "--get".
>> 
>> But "git config --get" users have been promised to get errors on
>> duplicate values so far, so I have to say this needs to come with a
>> big red sign about API breakage.
>
> The problem is that scripts currently using "--get" are broken by
> duplicate entries in include files, whereas internal git handles them
> just fine.  Introducing a new switch means that everybody also has to
> start using it.

Not exactly.  There are three classes of people:

 - wrote scripts using --get; found out that --get barfs if you feed
   two or more of the same, and have long learned to accept it as a
   limitation and adjusted their configuration files to avoid it.
   They have been doing just fine and wouldn't benefit from this
   series at all.

 - wrote scripts using --get, relying on it barfing if fed zero
   (i.e. missing) or two or more (i.e. ambiguous), perhaps a way to
   keep their configuration files (arguably unnecessarily) clean.
   They are directly harmed by this series.

 - wrote scripts using --get-all and did the last-one-wins
   themselves.  They wouldn't benefit directly from this series,
   unless they are willing to spend a bit of time to remove their
   own last-one-wins logic and replace --get-all with --get (but the
   same effort is needed to migrate to --get-one).

 - wanted to write scripts using --get, but after finding out that
   it barfs if you feed two, gave up emulating the internal, without
   realizing that they could do so with --get-all.  They can now
   write their scripts without using --get-all.

The only people who benefit are from the last class; it does not
matter if they have to write --get-one or --get.

> That is not an excuse for breakage, of course, but only a motivation for
> considering it. And here is what I think mitigates that breakage:
>
>   1. If you are a script that cares about multiple values and choosing
>  one (whether last-one-wins or otherwise), you are already using
>  --get-all and are not affected.

Correct.

>   2. If you are a script that only wants to mimic how get retrieves
>  a single value, then this is a bug fix, as it brings "--get" in
>  line with git's internal use.

But by definition, no such script exists (if it used "--get", it
didn't mimic the internal in the first place).

>   3. If you are a script that only wants a single value, but cares about
>  duplicates, you are already failing to notice them when the
>  duplicates are cross-file. That is, we already do "last one wins"
>  if an entry is found in ~/.gitconfig and .git/config.

Yeah, so the only ones that can be harmed is a config lint that uses
the --get option with --file to make sure variables they know must
be single value are indeed so, and they are not doing a thorough
job, unless they are checking across files themselves, at which
point they are better off using --get-all piped to "sort | uniq -c"
or something.  So breakages do not matter much for correctly written
scripts.

> I would argue that anybody fetching standard git config variables (and
> not using --list or --get-all) falls into case (2) and is being fixed,
> as they will not otherwise match the behavior of git itself.

As people shove random stuff that git itself does not care about in
their config files, they may not care, though.

> IOW, I do not see a legitimate use case for this, but see it as an extra
> check on broken config.

Yes, I agree with the latter half of that sentence.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 0/8] fix git-config with duplicate variable entries

2012-11-21 Thread Jeff King
On Tue, Nov 20, 2012 at 11:08:43AM -0800, Junio C Hamano wrote:

> > Thanks. It had a few minor flaws (like a memory leak). I fixed those,
> > updated the tests, and split it out into a few more readable commits. In
> > the process, I managed to uncover and fix a few other memory leaks in
> > the area. I think this version is much more readable, and writing the
> > rationale for patch 7 convinced me that it's the right thing to do.
> > Another round of review would be appreciated.
> >
> >   [1/8]: t1300: style updates
> >   [2/8]: t1300: remove redundant test
> >   [3/8]: t1300: test "git config --get-all" more thoroughly
> >   [4/8]: git-config: remove memory leak of key regexp
> >   [5/8]: git-config: fix regexp memory leaks on error conditions
> >   [6/8]: git-config: collect values instead of immediately printing
> >   [7/8]: git-config: do not complain about duplicate entries
> >   [8/8]: git-config: use git_config_with_options
> >
> > For those just joining us, the interesting bit is patch 7, which fixes
> > some inconsistencies between the "git-config" tool and how the internal
> > config callbacks work.
> 
> The way for the Porcelain scripts to mimick the internal "last one
> wins" has been to grab values out of --get-all and do the "last one
> wins" themselves, and I agree that a mode that frees them from
> having to worry about it is a good *addition*.  I would even say
> that if we were designing "git config" plumbing without existing
> users, it would be the only sensible behaviour for "--get".
> 
> But "git config --get" users have been promised to get errors on
> duplicate values so far, so I have to say this needs to come with a
> big red sign about API breakage.

The problem is that scripts currently using "--get" are broken by
duplicate entries in include files, whereas internal git handles them
just fine.  Introducing a new switch means that everybody also has to
start using it.

That is not an excuse for breakage, of course, but only a motivation for
considering it. And here is what I think mitigates that breakage:

  1. If you are a script that cares about multiple values and choosing
 one (whether last-one-wins or otherwise), you are already using
 --get-all and are not affected.

  2. If you are a script that only wants to mimic how get retrieves
 a single value, then this is a bug fix, as it brings "--get" in
 line with git's internal use.

  3. If you are a script that only wants a single value, but cares about
 duplicates, you are already failing to notice them when the
 duplicates are cross-file. That is, we already do "last one wins"
 if an entry is found in ~/.gitconfig and .git/config.

I would argue that anybody fetching standard git config variables (and
not using --list or --get-all) falls into case (2) and is being fixed,
as they will not otherwise match the behavior of git itself.

For other variables that porcelains want to stuff inside the config
files, I suppose they could fall into case (3). But I am not sure why
they would care about duplicates. They have asked git for a single
value, and if they care more deeply about multiple values (but only
within a single file!), what do they hope to accomplish by not calling
--get-all? That is, what is the use case where this makes any sense?

The only case I can think of where the distinction even matters is:

  1. Porcelain foo writes to the .gitfoo file via "git config -f
 .gitfoo".

  2. It accidentally writes using "--add" instead of just setting the
 value.

  3. It fetches via "git config -f .gitfoo --get foo.var". Before my
 patch, duplicate detection would notice the bug in (2) and barf.
 Now, it silently takes the most recently added value (which is
 probably what was meant anyway).

IOW, I do not see a legitimate use case for this, but see it as an extra
check on broken config. But it is catching an unlikely condition, and is
an overly restrictive check, in that it is triggering on totally
reasonable config. So we are not breaking a use case so much as a
loosening a (in my opinion) useless check.

But maybe I am missing some more sane case.

> I would feel safer to introduce --get-one or something now, and
> worry about migration as a separate step.

Part of my impetus for fixing --get is that it let us cleanup and
harmonize the git_config() and git-config implementations. If we are not
going to do that, adding an extra option is not that appealing, as we
are stuck with the old duplicated code, anyway. As I mentioned above,
this only really affects include files (because from git-config's
perspective, the entries it sees are all "the same file", as they come
from the same call into git_config_from_file). If we are not going to
fix --get for all callers, it would probably make more sense to just
omit the duplicate suppression for entries across include-file
boundaries (which are something that callers would have to opt into when
using a specific file, anyway). IOW, to tre

Re: [PATCH 0/8] fix git-config with duplicate variable entries

2012-11-20 Thread Junio C Hamano
Jeff King  writes:

> On Tue, Oct 23, 2012 at 04:13:44PM +0200, Ævar Arnfjörð Bjarmason wrote:
>
>> > It fails a few tests in t1300, but it looks like those tests are testing
>> > for the behavior we have identified as wrong, and should be fixed.
>> 
>> I think this patch looks good.
>
> Thanks. It had a few minor flaws (like a memory leak). I fixed those,
> updated the tests, and split it out into a few more readable commits. In
> the process, I managed to uncover and fix a few other memory leaks in
> the area. I think this version is much more readable, and writing the
> rationale for patch 7 convinced me that it's the right thing to do.
> Another round of review would be appreciated.
>
>   [1/8]: t1300: style updates
>   [2/8]: t1300: remove redundant test
>   [3/8]: t1300: test "git config --get-all" more thoroughly
>   [4/8]: git-config: remove memory leak of key regexp
>   [5/8]: git-config: fix regexp memory leaks on error conditions
>   [6/8]: git-config: collect values instead of immediately printing
>   [7/8]: git-config: do not complain about duplicate entries
>   [8/8]: git-config: use git_config_with_options
>
> For those just joining us, the interesting bit is patch 7, which fixes
> some inconsistencies between the "git-config" tool and how the internal
> config callbacks work.


The way for the Porcelain scripts to mimick the internal "last one
wins" has been to grab values out of --get-all and do the "last one
wins" themselves, and I agree that a mode that frees them from
having to worry about it is a good *addition*.  I would even say
that if we were designing "git config" plumbing without existing
users, it would be the only sensible behaviour for "--get".

But "git config --get" users have been promised to get errors on
duplicate values so far, so I have to say this needs to come with a
big red sign about API breakage.

I would feel safer to introduce --get-one or something now, and
worry about migration as a separate step.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html