Re: [PATCH 0/8] fix git-config with duplicate variable entries
Junio C Hamano wrote: Jeff King writes: ... Not exactly. There are three classes of people: - wrote scripts using --get; found out that --get barfs if you feed two or more of the same, and have long learned to accept it as a limitation and adjusted their configuration files to avoid it. They have been doing just fine and wouldn't benefit from this series at all. - wrote scripts using --get, relying on it barfing if fed zero (i.e. missing) or two or more (i.e. ambiguous), perhaps a way to keep their configuration files (arguably unnecessarily) clean. They are directly harmed by this series. - wrote scripts using --get-all and did the last-one-wins themselves. They wouldn't benefit directly from this series, unless they are willing to spend a bit of time to remove their own last-one-wins logic and replace --get-all with --get (but the same effort is needed to migrate to --get-one). - wanted to write scripts using --get, but after finding out that it barfs if you feed two, gave up emulating the internal, without realizing that they could do so with --get-all. They can now write their scripts without using --get-all. There are three classes ofpeople: those that can count and those that can't Sorry could not resist ;-) -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] fix git-config with duplicate variable entries
Jeff King writes: > So what do we want to do? Nothing. We'd just let it graduate along with other topics, and deal with a case where somebody screams, which is unlikely to happen ;-). -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] fix git-config with duplicate variable entries
On Wed, Nov 21, 2012 at 11:46:33AM -0800, Junio C Hamano wrote: > > The problem is that scripts currently using "--get" are broken by > > duplicate entries in include files, whereas internal git handles them > > just fine. Introducing a new switch means that everybody also has to > > start using it. > > Not exactly. There are three classes of people: > > - wrote scripts using --get; found out that --get barfs if you feed >two or more of the same, and have long learned to accept it as a >limitation and adjusted their configuration files to avoid it. >They have been doing just fine and wouldn't benefit from this >series at all. > > - wrote scripts using --get, relying on it barfing if fed zero >(i.e. missing) or two or more (i.e. ambiguous), perhaps a way to >keep their configuration files (arguably unnecessarily) clean. >They are directly harmed by this series. > > - wrote scripts using --get-all and did the last-one-wins >themselves. They wouldn't benefit directly from this series, >unless they are willing to spend a bit of time to remove their >own last-one-wins logic and replace --get-all with --get (but the >same effort is needed to migrate to --get-one). > > - wanted to write scripts using --get, but after finding out that >it barfs if you feed two, gave up emulating the internal, without >realizing that they could do so with --get-all. They can now >write their scripts without using --get-all. I have a feeling that your cases 2 and 4 do not really exist. Because we did "last one wins" in the case that it mattered (between different files), it was always "good enough" to just assume that using "--get" behaved like git did internally, and nobody really noticed or cared that we did duplicate detection at all. But that is just from my own perspective; it is not like I did a complete survey of git-config users. More compelling to me is that the only ones negatively affected are your case 2, and that is qualified by the "arguably unnecessary" you wrote. Everyone else is not negatively impacted, and can later move to using "--get" if they want to give up any home-grown --get-all code. > > 2. If you are a script that only wants to mimic how get retrieves > > a single value, then this is a bug fix, as it brings "--get" in > > line with git's internal use. > > But by definition, no such script exists (if it used "--get", it > didn't mimic the internal in the first place). They do not exist if we assume that anyone using "--get" carefully thought about the distinction. But I have the impression that is not the case, and that they _meant_ to behave just like git, and did not realize they were not doing so. Even our own scripts are guilty of this. > Yeah, so the only ones that can be harmed is a config lint that uses > the --get option with --file to make sure variables they know must > be single value are indeed so, and they are not doing a thorough > job, unless they are checking across files themselves, at which > point they are better off using --get-all piped to "sort | uniq -c" > or something. So breakages do not matter much for correctly written > scripts. Right. > > IOW, I do not see a legitimate use case for this, but see it as an extra > > check on broken config. > > Yes, I agree with the latter half of that sentence. So what do we want to do? -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] fix git-config with duplicate variable entries
Jeff King writes: >> The way for the Porcelain scripts to mimick the internal "last one >> wins" has been to grab values out of --get-all and do the "last one >> wins" themselves, and I agree that a mode that frees them from >> having to worry about it is a good *addition*. I would even say >> that if we were designing "git config" plumbing without existing >> users, it would be the only sensible behaviour for "--get". >> >> But "git config --get" users have been promised to get errors on >> duplicate values so far, so I have to say this needs to come with a >> big red sign about API breakage. > > The problem is that scripts currently using "--get" are broken by > duplicate entries in include files, whereas internal git handles them > just fine. Introducing a new switch means that everybody also has to > start using it. Not exactly. There are three classes of people: - wrote scripts using --get; found out that --get barfs if you feed two or more of the same, and have long learned to accept it as a limitation and adjusted their configuration files to avoid it. They have been doing just fine and wouldn't benefit from this series at all. - wrote scripts using --get, relying on it barfing if fed zero (i.e. missing) or two or more (i.e. ambiguous), perhaps a way to keep their configuration files (arguably unnecessarily) clean. They are directly harmed by this series. - wrote scripts using --get-all and did the last-one-wins themselves. They wouldn't benefit directly from this series, unless they are willing to spend a bit of time to remove their own last-one-wins logic and replace --get-all with --get (but the same effort is needed to migrate to --get-one). - wanted to write scripts using --get, but after finding out that it barfs if you feed two, gave up emulating the internal, without realizing that they could do so with --get-all. They can now write their scripts without using --get-all. The only people who benefit are from the last class; it does not matter if they have to write --get-one or --get. > That is not an excuse for breakage, of course, but only a motivation for > considering it. And here is what I think mitigates that breakage: > > 1. If you are a script that cares about multiple values and choosing > one (whether last-one-wins or otherwise), you are already using > --get-all and are not affected. Correct. > 2. If you are a script that only wants to mimic how get retrieves > a single value, then this is a bug fix, as it brings "--get" in > line with git's internal use. But by definition, no such script exists (if it used "--get", it didn't mimic the internal in the first place). > 3. If you are a script that only wants a single value, but cares about > duplicates, you are already failing to notice them when the > duplicates are cross-file. That is, we already do "last one wins" > if an entry is found in ~/.gitconfig and .git/config. Yeah, so the only ones that can be harmed is a config lint that uses the --get option with --file to make sure variables they know must be single value are indeed so, and they are not doing a thorough job, unless they are checking across files themselves, at which point they are better off using --get-all piped to "sort | uniq -c" or something. So breakages do not matter much for correctly written scripts. > I would argue that anybody fetching standard git config variables (and > not using --list or --get-all) falls into case (2) and is being fixed, > as they will not otherwise match the behavior of git itself. As people shove random stuff that git itself does not care about in their config files, they may not care, though. > IOW, I do not see a legitimate use case for this, but see it as an extra > check on broken config. Yes, I agree with the latter half of that sentence. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 0/8] fix git-config with duplicate variable entries
On Tue, Nov 20, 2012 at 11:08:43AM -0800, Junio C Hamano wrote: > > Thanks. It had a few minor flaws (like a memory leak). I fixed those, > > updated the tests, and split it out into a few more readable commits. In > > the process, I managed to uncover and fix a few other memory leaks in > > the area. I think this version is much more readable, and writing the > > rationale for patch 7 convinced me that it's the right thing to do. > > Another round of review would be appreciated. > > > > [1/8]: t1300: style updates > > [2/8]: t1300: remove redundant test > > [3/8]: t1300: test "git config --get-all" more thoroughly > > [4/8]: git-config: remove memory leak of key regexp > > [5/8]: git-config: fix regexp memory leaks on error conditions > > [6/8]: git-config: collect values instead of immediately printing > > [7/8]: git-config: do not complain about duplicate entries > > [8/8]: git-config: use git_config_with_options > > > > For those just joining us, the interesting bit is patch 7, which fixes > > some inconsistencies between the "git-config" tool and how the internal > > config callbacks work. > > The way for the Porcelain scripts to mimick the internal "last one > wins" has been to grab values out of --get-all and do the "last one > wins" themselves, and I agree that a mode that frees them from > having to worry about it is a good *addition*. I would even say > that if we were designing "git config" plumbing without existing > users, it would be the only sensible behaviour for "--get". > > But "git config --get" users have been promised to get errors on > duplicate values so far, so I have to say this needs to come with a > big red sign about API breakage. The problem is that scripts currently using "--get" are broken by duplicate entries in include files, whereas internal git handles them just fine. Introducing a new switch means that everybody also has to start using it. That is not an excuse for breakage, of course, but only a motivation for considering it. And here is what I think mitigates that breakage: 1. If you are a script that cares about multiple values and choosing one (whether last-one-wins or otherwise), you are already using --get-all and are not affected. 2. If you are a script that only wants to mimic how get retrieves a single value, then this is a bug fix, as it brings "--get" in line with git's internal use. 3. If you are a script that only wants a single value, but cares about duplicates, you are already failing to notice them when the duplicates are cross-file. That is, we already do "last one wins" if an entry is found in ~/.gitconfig and .git/config. I would argue that anybody fetching standard git config variables (and not using --list or --get-all) falls into case (2) and is being fixed, as they will not otherwise match the behavior of git itself. For other variables that porcelains want to stuff inside the config files, I suppose they could fall into case (3). But I am not sure why they would care about duplicates. They have asked git for a single value, and if they care more deeply about multiple values (but only within a single file!), what do they hope to accomplish by not calling --get-all? That is, what is the use case where this makes any sense? The only case I can think of where the distinction even matters is: 1. Porcelain foo writes to the .gitfoo file via "git config -f .gitfoo". 2. It accidentally writes using "--add" instead of just setting the value. 3. It fetches via "git config -f .gitfoo --get foo.var". Before my patch, duplicate detection would notice the bug in (2) and barf. Now, it silently takes the most recently added value (which is probably what was meant anyway). IOW, I do not see a legitimate use case for this, but see it as an extra check on broken config. But it is catching an unlikely condition, and is an overly restrictive check, in that it is triggering on totally reasonable config. So we are not breaking a use case so much as a loosening a (in my opinion) useless check. But maybe I am missing some more sane case. > I would feel safer to introduce --get-one or something now, and > worry about migration as a separate step. Part of my impetus for fixing --get is that it let us cleanup and harmonize the git_config() and git-config implementations. If we are not going to do that, adding an extra option is not that appealing, as we are stuck with the old duplicated code, anyway. As I mentioned above, this only really affects include files (because from git-config's perspective, the entries it sees are all "the same file", as they come from the same call into git_config_from_file). If we are not going to fix --get for all callers, it would probably make more sense to just omit the duplicate suppression for entries across include-file boundaries (which are something that callers would have to opt into when using a specific file, anyway). IOW, to tre
Re: [PATCH 0/8] fix git-config with duplicate variable entries
Jeff King writes: > On Tue, Oct 23, 2012 at 04:13:44PM +0200, Ævar Arnfjörð Bjarmason wrote: > >> > It fails a few tests in t1300, but it looks like those tests are testing >> > for the behavior we have identified as wrong, and should be fixed. >> >> I think this patch looks good. > > Thanks. It had a few minor flaws (like a memory leak). I fixed those, > updated the tests, and split it out into a few more readable commits. In > the process, I managed to uncover and fix a few other memory leaks in > the area. I think this version is much more readable, and writing the > rationale for patch 7 convinced me that it's the right thing to do. > Another round of review would be appreciated. > > [1/8]: t1300: style updates > [2/8]: t1300: remove redundant test > [3/8]: t1300: test "git config --get-all" more thoroughly > [4/8]: git-config: remove memory leak of key regexp > [5/8]: git-config: fix regexp memory leaks on error conditions > [6/8]: git-config: collect values instead of immediately printing > [7/8]: git-config: do not complain about duplicate entries > [8/8]: git-config: use git_config_with_options > > For those just joining us, the interesting bit is patch 7, which fixes > some inconsistencies between the "git-config" tool and how the internal > config callbacks work. The way for the Porcelain scripts to mimick the internal "last one wins" has been to grab values out of --get-all and do the "last one wins" themselves, and I agree that a mode that frees them from having to worry about it is a good *addition*. I would even say that if we were designing "git config" plumbing without existing users, it would be the only sensible behaviour for "--get". But "git config --get" users have been promised to get errors on duplicate values so far, so I have to say this needs to come with a big red sign about API breakage. I would feel safer to introduce --get-one or something now, and worry about migration as a separate step. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html