Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On 12/06, Jeff King wrote: > On Tue, Dec 06, 2016 at 10:22:21AM -0800, Stefan Beller wrote: > > > >> Maybe even go a step further and say that the config code needs a context > > >> "object". > > > > > > If I were writing git from scratch, I'd consider making a "struct > > > repository" object. I'm not sure how painful it would be to retro-fit it > > > at this point. > > > > Would it be possible to introduce "the repo" struct similar to "the index" > > in cache.h? > > > > From a submodule perspective I would very much welcome this > > object oriented approach to repositories. > > I think it may be more complicated, because there's some implicit global > state in "the repo", like where files are relative to our cwd. All of > those low-level functions would have to start caring about which repo > we're talking about so they can prefix the appropriate working tree > path, etc. > > For some operations that would be fine, but there are things that would > subtly fail for submodules. I'm thinking we'd end up with some code > state like: > > /* finding a repo does not modify global state; good */ > struct repository *repo = repo_discover("."); > > /* obvious repo-level operations like looking up refs can be done with >* a repository object; good */ > repo_for_each_ref(repo, callback, NULL); > > /* >* "enter" the repo so that we are at the top-level of the working >* tree, etc. After this you can actually look at the index without >* things breaking. >*/ > repo_enter(repo); > > That would be enough to implement a lot of submodule-level stuff, but it > would break pretty subtly as soon as you asked the submodule about its > working tree. The solution is to make everything that accesses the > working tree aware of the idea of a working tree root besides the cwd. > But that's a pretty invasive change. > > -Peff Some other challenges would be how to address people setting environment variables like GIT_DIR that indicate the location of a repositories git directory, which wouldn't work if you have multiple repos open. I do agree that having a repo object of some sort would aid in simplifying submodule operations but may require too many invasive changes to basic low-level functions. -- Brandon Williams
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Tue, Dec 06, 2016 at 10:22:21AM -0800, Stefan Beller wrote: > >> Maybe even go a step further and say that the config code needs a context > >> "object". > > > > If I were writing git from scratch, I'd consider making a "struct > > repository" object. I'm not sure how painful it would be to retro-fit it > > at this point. > > Would it be possible to introduce "the repo" struct similar to "the index" > in cache.h? > > From a submodule perspective I would very much welcome this > object oriented approach to repositories. I think it may be more complicated, because there's some implicit global state in "the repo", like where files are relative to our cwd. All of those low-level functions would have to start caring about which repo we're talking about so they can prefix the appropriate working tree path, etc. For some operations that would be fine, but there are things that would subtly fail for submodules. I'm thinking we'd end up with some code state like: /* finding a repo does not modify global state; good */ struct repository *repo = repo_discover("."); /* obvious repo-level operations like looking up refs can be done with * a repository object; good */ repo_for_each_ref(repo, callback, NULL); /* * "enter" the repo so that we are at the top-level of the working * tree, etc. After this you can actually look at the index without * things breaking. */ repo_enter(repo); That would be enough to implement a lot of submodule-level stuff, but it would break pretty subtly as soon as you asked the submodule about its working tree. The solution is to make everything that accesses the working tree aware of the idea of a working tree root besides the cwd. But that's a pretty invasive change. -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Tue, Dec 6, 2016 at 7:09 AM, Jeff King wrote: >> >> Maybe even go a step further and say that the config code needs a context >> "object". > > If I were writing git from scratch, I'd consider making a "struct > repository" object. I'm not sure how painful it would be to retro-fit it > at this point. Would it be possible to introduce "the repo" struct similar to "the index" in cache.h? >From a submodule perspective I would very much welcome this object oriented approach to repositories.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Tue, Dec 06, 2016 at 03:48:38PM +0100, Johannes Schindelin wrote: > > Should it blindly look at ".git/config"? > > Absolutely not, of course. You did not need me to say that. > > > Now your program behaves differently depending on whether you are in the > > top-level of the working tree. > > Exactly. This, BTW, is already how the code would behave if anybody called > `git_path()` before `setup_git_directory()`, as the former function > implicitly calls `setup_git_env()` which does *not* call > `setup_git_directory()` but *does* set up `git_dir` which is then used by > `do_git_config_sequence()`.. > > We have a few of these nasty surprises in our code base, where code > silently assumes that global state is set up correctly, and succeeds in > sometimes surprising ways if it is not. Right. I have been working on fixing this. v2.11 has a ton of tweaks in this area, and my patch to die() rather than default to ".git" is cooking in next to catch any stragglers. > > Should it speculatively do repo discovery, and use any discovered config > > file? > > Personally, I find the way we discover the repository most irritating. It > seems that we have multiple, mutually incompatible code paths > (`setup_git_directory()` and `setup_git_env()` come to mind already, and > it does not help that consecutive calls to `setup_git_directory()` will > yield a very unexpected outcome). I agree. We should be killing off setup_git_env(), which is something I've been slowly working towards over the years. There are some annoyances with setup_git_directory(), too (like the fact that you cannot ask "is there a git repository you can find" without making un-recoverable changes to the process state). I think we should fix those, too. > > Now some commands respect config that they shouldn't (e.g., running "git > > init foo.git" from inside another repository will accidentally pick up > > the value of core.sharedrepository from wherever you happen to run it). > > Right. That points to another problem with the way we do things: we leak > global state from discovering a git_dir, which means that we can neither > undo nor override it. > > If we discovered our git_dir in a robust manner, `git init` could say: > hey, this git_dir is actually not what I wanted, here is what I want. > > Likewise, `git submodule` would eventually be able to run in the very same > process as the calling `git`, as would a local fetch. Yep, I agree with all that. I do think things _have_ been improving over the years, and the code is way less tangled than it used to be. But there are so many corner cases, and the code is so fundamental, that it has been slow going. I'd be happy to review patches if you want to push it along. > > So I think the caller of the config code has to provide some kind of > > context about how it is expecting to run and how the value will be used. > > Yep. > > Maybe even go a step further and say that the config code needs a context > "object". If I were writing git from scratch, I'd consider making a "struct repository" object. I'm not sure how painful it would be to retro-fit it at this point. > > Right now if setup_git_directory() or similar hasn't been called, the > > config code does not look. > > Correct. > > Actually, half correct. If setup_git_directory() has not been called, but, > say, git_path() (and thereby implicitly setup_git_env()), the config code > *does* look. At a hard-coded .git/config. Not since b9605bc4f (config: only read .git/config from configured repos, 2016-09-12). That's why pager.c needs its little hack. I guess you could see that as a step backwards, but I think it is necessary one on the road to doing it right. > > Ideally there would be a way for a caller to say "I am running early and > > not even sure yet if we are in a repo; please speculatively try to find > > repo config for me". > > And ideally, it would not roll *yet another* way to discover the git_dir, > but it would reuse the same function (fixing it not to chdir() > unilaterally). Yes, absolutely. > Of course, not using `chdir()` means that we have to figure out symbolic > links somehow, in case somebody works from a symlinked subdirectory, e.g.: > > ln -s $PWD/t/ ~/test-directory > cd ~/test-directory > git log There's work happening elsewhere[1] on making real_path() work without calling chdir(). Which necessarily involves resolving symlinks ourselves. I wonder if we could leverage that work here (ideally by using real_path() under the hood, and not reimplementing the same readlink() logic ourselves). [1] http://public-inbox.org/git/1480964316-99305-1-git-send-email-bmw...@google.com/ > > The pager code does this manually, and without great accuracy; see the > > hack in pager.c's read_early_config(). > > I saw it. And that is what triggered the mail you are responding to (you > probably saw my eye-rolling between the lines). > > The real question is: can we fix this? Or is there simply too gr
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Tue, 6 Dec 2016, Jeff King wrote: > On Tue, Dec 06, 2016 at 02:16:35PM +0100, Johannes Schindelin wrote: > > > > Johannes Schindelin writes: > > > > > > > Seriously, do you really think it is a good idea to have > > > > git_config_get_value() *ignore* any value in .git/config? > > > > > > When we do not know ".git/" directory we see is the repository we > > > were told to work on, then we should ignore ".git/config" file. So > > > allowing git_config_get_value() to _ignore_ ".git/config" before the > > > program calls setup_git_directory() does have its uses. > > > > I think you are wrong. This is yet another inconsistent behavior that > > violates the Law of Least Surprises. > > There are surprises in this code any way you turn. If we have not > called setup_git_directory(), then how does git_config_get_value() know > if we are in a repository or not? My biggest surprise, frankly, would be that `git init` and `git clone` are not special-cased. > Should it blindly look at ".git/config"? Absolutely not, of course. You did not need me to say that. > Now your program behaves differently depending on whether you are in the > top-level of the working tree. Exactly. This, BTW, is already how the code would behave if anybody called `git_path()` before `setup_git_directory()`, as the former function implicitly calls `setup_git_env()` which does *not* call `setup_git_directory()` but *does* set up `git_dir` which is then used by `do_git_config_sequence()`.. We have a few of these nasty surprises in our code base, where code silently assumes that global state is set up correctly, and succeeds in sometimes surprising ways if it is not. > Should it speculatively do repo discovery, and use any discovered config > file? Personally, I find the way we discover the repository most irritating. It seems that we have multiple, mutually incompatible code paths (`setup_git_directory()` and `setup_git_env()` come to mind already, and it does not help that consecutive calls to `setup_git_directory()` will yield a very unexpected outcome). Just try to explain to a veteran software engineer why you cannot call `setup_git_directory_gently()` multiple times and expect the very same return value every time. Another irritation is that some commands that clearly would like to use a repository if there is one (such as `git diff`) are *not* marked with RUN_SETUP_GENTLY, due to these unfortunate implementation details. > Now some commands respect config that they shouldn't (e.g., running "git > init foo.git" from inside another repository will accidentally pick up > the value of core.sharedrepository from wherever you happen to run it). Right. That points to another problem with the way we do things: we leak global state from discovering a git_dir, which means that we can neither undo nor override it. If we discovered our git_dir in a robust manner, `git init` could say: hey, this git_dir is actually not what I wanted, here is what I want. Likewise, `git submodule` would eventually be able to run in the very same process as the calling `git`, as would a local fetch. > So I think the caller of the config code has to provide some kind of > context about how it is expecting to run and how the value will be used. Yep. Maybe even go a step further and say that the config code needs a context "object". > Right now if setup_git_directory() or similar hasn't been called, the > config code does not look. Correct. Actually, half correct. If setup_git_directory() has not been called, but, say, git_path() (and thereby implicitly setup_git_env()), the config code *does* look. At a hard-coded .git/config. > Ideally there would be a way for a caller to say "I am running early and > not even sure yet if we are in a repo; please speculatively try to find > repo config for me". And ideally, it would not roll *yet another* way to discover the git_dir, but it would reuse the same function (fixing it not to chdir() unilaterally). Of course, not using `chdir()` means that we have to figure out symbolic links somehow, in case somebody works from a symlinked subdirectory, e.g.: ln -s $PWD/t/ ~/test-directory cd ~/test-directory git log > The pager code does this manually, and without great accuracy; see the > hack in pager.c's read_early_config(). I saw it. And that is what triggered the mail you are responding to (you probably saw my eye-rolling between the lines). The real question is: can we fix this? Or is there simply too great reluctance to change the current code? > I think the way forward is: > > 1. Make that an optional behavior in git_config_with_options() so > other spots can reuse it (probably alias lookup, and something like > your difftool config). Ideally: *any* early call to `git_config_get_value()`. Make things less surprising. > 2. Make it more accurate. Right now it blindly looks in .git/config, > but it should be able to do the usual repo-detection
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Tue, Dec 06, 2016 at 02:16:35PM +0100, Johannes Schindelin wrote: > > Johannes Schindelin writes: > > > > > Seriously, do you really think it is a good idea to have > > > git_config_get_value() *ignore* any value in .git/config? > > > > When we do not know ".git/" directory we see is the repository we were > > told to work on, then we should ignore ".git/config" file. So allowing > > git_config_get_value() to _ignore_ ".git/config" before the program > > calls setup_git_directory() does have its uses. > > I think you are wrong. This is yet another inconsistent behavior that > violates the Law of Least Surprises. There are surprises in this code any way you turn. If we have not called setup_git_directory(), then how does git_config_get_value() know if we are in a repository or not? Should it blindly look at ".git/config"? Now your program behaves differently depending on whether you are in the top-level of the working tree. Should it speculatively do repo discovery, and use any discovered config file? Now some commands respect config that they shouldn't (e.g., running "git init foo.git" from inside another repository will accidentally pick up the value of core.sharedrepository from wherever you happen to run it). So I think the caller of the config code has to provide some kind of context about how it is expecting to run and how the value will be used. Right now if setup_git_directory() or similar hasn't been called, the config code does not look. Ideally there would be a way for a caller to say "I am running early and not even sure yet if we are in a repo; please speculatively try to find repo config for me". The pager code does this manually, and without great accuracy; see the hack in pager.c's read_early_config(). I think the way forward is: 1. Make that an optional behavior in git_config_with_options() so other spots can reuse it (probably alias lookup, and something like your difftool config). 2. Make it more accurate. Right now it blindly looks in .git/config, but it should be able to do the usual repo-detection (_without_ actually entering the repo) to try to find a possible config file. -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Junio, On Mon, 5 Dec 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > Seriously, do you really think it is a good idea to have > > git_config_get_value() *ignore* any value in .git/config? > > When we do not know ".git/" directory we see is the repository we were > told to work on, then we should ignore ".git/config" file. So allowing > git_config_get_value() to _ignore_ ".git/config" before the program > calls setup_git_directory() does have its uses. I think you are wrong. This is yet another inconsistent behavior that violates the Law of Least Surprises. > > We need to fix this. > > I have a feeling that "environment modifications that cannot be undone" > we used as the rationale in 73c2779f42 ("builtin-am: implement skeletal > builtin am", 2015-08-04) might be overly pessimistic and in order to > implement undo_setup_git_directory(), all we need to do may just be > matter of doing a chdir(2) back to prefix and unsetting GIT_PREFIX > environment, but I haven't looked into details of the setup sequence > recently. The problem is that we may not know enough anymore to "undo setup_git_directory()", as some environment variables may have been set before calling Git. If we simply unset the environment variables, we do it incorrectly. On the other hand, the environment variables *may* have been set by setup_git_directory(). In which case we *do* have to unset them. The entire setup_git_directory() logic is a bit of a historically-grown garden. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Johannes Schindelin writes: > Seriously, do you really think it is a good idea to have > git_config_get_value() *ignore* any value in .git/config? When we do not know ".git/" directory we see is the repository we were told to work on, then we should ignore ".git/config" file. So allowing git_config_get_value() to _ignore_ ".git/config" before the program calls setup_git_directory() does have its uses. But I agree that "difftool.useBuiltin" that flips between two implementations is a different use case than the above. Both implementations eventually want to work on ".git/" and would want to read from ".git/config". > We need to fix this. I have a feeling that "environment modifications that cannot be undone" we used as the rationale in 73c2779f42 ("builtin-am: implement skeletal builtin am", 2015-08-04) might be overly pessimistic and in order to implement undo_setup_git_directory(), all we need to do may just be matter of doing a chdir(2) back to prefix and unsetting GIT_PREFIX environment, but I haven't looked into details of the setup sequence recently.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Junio, On Thu, 1 Dec 2016, Junio C Hamano wrote: > As to "I have to spawn config", I think it is sensible to start the > cmd_difftool() wrapper without adding RUN_SETUP to the command table, > then call git_config_get_bool() to check the configuration only from > system and per-user files, and then finally either call into > builtin_difftool() where setup_git_directory() is called, or spawn the > scripted difftool, as Peff already said. I just spent a considerable amount of time to figure out that I overlooked your comment about "only from system and per-user files". > Your "users opt-in while installing" is not about setting per-repository > option. Wow. That would make things really inconsistent. I know that *I* would want to be able to switch that opt-in feature off for repositories where I absolutely rely on some rock-solid difftool. And as a user I would not only be shocked when it would not work as expected, but I would be *positively* shocked to learn that this is intended, by design. Seriously, do you really think it is a good idea to have git_config_get_value() *ignore* any value in .git/config? Really? It caught *me* by surprise, and it definitely makes for a very, very bad user experience. And this is much more fundamental than just difftool.useBuiltin. I see for example that our pager. setting is ignoring per-repository settings. That is, if the user sets pager. in ~/.gitconfig and then tries to override this in a specific repository (which any user would only do for very good reasons, I am sure), Git would happily ignore that careful setup and create an angry user. The only reason this did not blow up in our face yet is that users apparently do not make use of this feature yet. Which does not make this behavior (that "git_config_get_value()" happily ignores .git/config) less broken. We need to fix this. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Johannes Schindelin writes: > The config kinda works now. But for what price. It stole 4 hours I did not > have. When the libexec/git-core/use-builtin-difftool solution took me a > grand total of half an hour to devise, implement and test. > > And you know what? I still do not really see what is so bad about it. I was wondering if I should explain myself again, even though I do not see what good it would do, as clearly my point did not come across in the other emails. And then you would just complain that I am making work for you. Clearly you do not seem to see why placing random files in $GIT_EXEC_PATH, which is a place for git subcommand implementations, is wrong, so I won't repeat it to you again. But you need to remember that you are not working on a Windows-only project. In non-Windows environment, many users would not have write access to /usr/libexec/git-core directory, but it is not just easy for them to write into ~/.gitconfig, but that is the way they are accustomed to, in order to affect the behaviour of Git for them. As to "I have to spawn config", I think it is sensible to start the cmd_difftool() wrapper without adding RUN_SETUP to the command table, then call git_config_get_bool() to check the configuration only from system and per-user files, and then finally either call into builtin_difftool() where setup_git_directory() is called, or spawn the scripted difftool, as Peff already said. Your "users opt-in while installing" is not about setting per-repository option. Calling git_config*(), setup_git_directory() and then git_config*() in this order should be safe, as setup_git_directory() would clear potentially cached configuration values read by any previous git_config*() calls, so any configuration enquiry made by builtin_difftool() would read from all three sources, not just system and per-user. So there is no chicken-and-egg issue, either.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Jakub Narębski writes: >> My original "create a file in libexec/git-core/" was simple, did the job >> reliably, and worked also for testing. >> >> It is a pity that you two gentlemen shot it down for being inelegant. And >> ever since, we try to find a solution that is as simple, works as >> reliably, also for testing, *and* appeases your tastes. > > I just would like to note that existence of file is used for both > git-daemon and gitweb (the latter following the git-daemon example). > > So there is a precedent for the use of this mechanism. I think you are thinking about git-daemon-export-ok (for 'git daemon') and $GITWEB_EXPORT_OK file (for 'gitweb'). You do realize that it is apples-and-oranges [*1*] to take these as analogous to what Dscho is trying to do, don't you? First of all, these are to control access to each repository on the server side; the presence of the file is checked in each repository. What Dscho wants is to control the behaviour of an installation of Git as a whole, no matter which repository is being accessed [*2*, *3*]. More importantly, did you notice that git-daemon-export-ok predates the configuration mechanism by a large margin? The "does the file exist?" check done in a87e8be2ae ("Add a "git-daemon" that listens on a TCP port", 2005-07-13) is a relic from the past [*4*], and 32f4aaccaa ("gitweb: export options", 2006-09-17) added GITWEB_EXPORT_OK to mimic it, also long time ago [*5*]. They are not something you would want to mimic in new programs these days. Besides, $GIT_EXEC_PATH is where you place git subcommands. Who in the right mind considers it even remotely sane to design a system where you have to throw in a file that is not a command to /usr/bin to control the behaviour of your system? [*6*] So the "precedent" is irrelevant in the first place, and even if it were relevant, it is a bad piece of advice to mimic it. [Footnote] *1* Or is it apples-and-pineapples these days? *2* Not that I agree with that desire, if I understand him correctly from his description against the approach based on an environment variable. If a user has multiple installations and not even aware of which one of them s/he is currently using, a mechanism that affects only one of them (instead of consistently affecting all of them) would lead to more confusion, I would think. *3* If such hermetically configured independent installations are desirable, etc/gitconfig aka "git config --system" is a more appropriate thing to use, and you do not need to do repository discovery before you can read it. *4* If we had config mechanism, we would have used it just like we use daemon.* variables to control what services are enabled for each repository. *5* By that time, the config mechanism did already exist, so the GITWEB_EXPORT_OK could have been a per-repository configuration, but "gitweb" had another excuse to deviate from the norm. "Is this repository visible?" was done during repository listing and the script did not want to run "git config" in each and every repository-like directory it encountered in File::Find::find(). *6* And I do not think $GIT_EXEC_PATH vs /usr/bin is apples-and-oranges analogy.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hello, W dniu 28.11.2016 o 18:34, Johannes Schindelin pisze: > My original "create a file in libexec/git-core/" was simple, did the job > reliably, and worked also for testing. > > It is a pity that you two gentlemen shot it down for being inelegant. And > ever since, we try to find a solution that is as simple, works as > reliably, also for testing, *and* appeases your tastes. I just would like to note that existence of file is used for both git-daemon and gitweb (the latter following the git-daemon example). So there is a precedent for the use of this mechanism. Best, -- Jakub Narębski
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Wed, Nov 30, 2016 at 01:30:47PM +0100, Johannes Schindelin wrote: > On Tue, 29 Nov 2016, Jeff King wrote: > > > On Tue, Nov 29, 2016 at 09:36:55PM +0100, Johannes Schindelin wrote: > > > > > So the suggestion by both you and Peff, to use an environment variable, > > > which is either global, or requires the user to set it manually per > > > session, is simply not a good idea at all. > > > > No, my suggestion was to use config and have the test suite use an > > environment variable to test both cases (preferably automatically, > > without the user having to do anything). > > > > I do not see how that fails to cover all of your use cases. > > Oh, so the suggestion is to have *both* a config *and* an environment > variable. That is not elegant. No, that is not at all what I said. I was going to explain myself again, but I do not see what good it would do, as clearly my point did not come across in the other three emails. And then you would just complain that I am making work for you. So whatever. I do not care about your difftool topic at all. Do whatever you like (which hey, I already said before, too). -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Junio, On Tue, 29 Nov 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > So the suggestion by both you and Peff, to use an environment variable, > > which is either global, or requires the user to set it manually per > > session, is simply not a good idea at all. > > As I already said, I do not have a strong preference between config > and env. I raised the env as a possible alternative that you can > think about its pros and cons, and as I already said, if you thought > and your concluded that config would work better for your needs, > that is fine by me. The env flat out fails, on the grounds of not integrating nicely into a Git for Windows installer. The config kinda works now. But for what price. It stole 4 hours I did not have. When the libexec/git-core/use-builtin-difftool solution took me a grand total of half an hour to devise, implement and test. And you know what? I still do not really see what is so bad about it. And I still see what is bad about the config "solution": it *creates* a chicken-and-egg problem with the order of config reading vs running scripts. It *creates* problems for requiring to spawn a `git config` call because reading the config in-process would mess up the global variables and environment *beyond repair*, making it *impossible* to even spawn the git-legacy-difftool Perl script. In short: the config setting now works. But it is ugly as hell. I wish I never had listened to you. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Tue, 29 Nov 2016, Jeff King wrote: > On Tue, Nov 29, 2016 at 09:36:55PM +0100, Johannes Schindelin wrote: > > > So the suggestion by both you and Peff, to use an environment variable, > > which is either global, or requires the user to set it manually per > > session, is simply not a good idea at all. > > No, my suggestion was to use config and have the test suite use an > environment variable to test both cases (preferably automatically, > without the user having to do anything). > > I do not see how that fails to cover all of your use cases. Oh, so the suggestion is to have *both* a config *and* an environment variable. That is not elegant. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Johannes Schindelin writes: > So the suggestion by both you and Peff, to use an environment variable, > which is either global, or requires the user to set it manually per > session, is simply not a good idea at all. As I already said, I do not have a strong preference between config and env. I raised the env as a possible alternative that you can think about its pros and cons, and as I already said, if you thought and your concluded that config would work better for your needs, that is fine by me.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Tue, Nov 29, 2016 at 09:36:55PM +0100, Johannes Schindelin wrote: > So the suggestion by both you and Peff, to use an environment variable, > which is either global, or requires the user to set it manually per > session, is simply not a good idea at all. No, my suggestion was to use config and have the test suite use an environment variable to test both cases (preferably automatically, without the user having to do anything). I do not see how that fails to cover all of your use cases. -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Junio, On Mon, 28 Nov 2016, Junio C Hamano wrote: > Johannes Schindelin writes: > > > However, I have been bitten time and again by problems that occurred only > > in production, our test suite (despite taking already waay too long to > > be truly useful in my daily development) was simply not good enough. > > > > So my plan was different: to let end users opt-in to test this new beast > > thoroughly, more thoroughly than any review would. > > I agree with that 100%. > > [...] > > > And for that, environment variables are just not an option. I need > > something that can be configured in a portable application, so that the > > main Git for Windows installation is unaffected. > > I am not sure I follow here. > > Are you saying that the users who are opting into the experiment > will keep two installations, one for daily use that avoids getting > hit by the experimental code and the other that is used for testing? I have obviously done a real bad job at explaining the Windows situation well enough. Many, many users have multiple installations of Git for Windows. If you have GitHub for Windows and installed the command-line tools: you got one. If you installed Git for Windows, you got another one. If you installed Visual Studio, chances are you have another one. If you got any number of third-party tools requiring Git functionality, you have another one. They all live in separate directories that are their own little pseudo Unix root directory structures, complete with etc/, usr/, var/. Users do not necessarily keep track, or for that matter, are aware of, the multiple different installations. Obviously, I do not want any installation other than the one the user just installed to pick up on the configuration. So the suggestion by both you and Peff, to use an environment variable, which is either global, or requires the user to set it manually per session, is simply not a good idea at all. > > My original "create a file in libexec/git-core/" was simple, did the job > > reliably, and worked also for testing. > > It may have been OK for quick-and-dirty hack during development, but > I do not think it was good in anything released. Well, you say that it is quick and dirty. I say it is the only viable solution I saw so far. All proposed alternative solutions fall flat on their bellies, simply by not working in all the cases I need them to work. As I said elsewhere: I look for a correct solution first, and then I thrive to make it pretty. You start the other way round, and I do not have time for that right now. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Johannes Schindelin writes: > However, I have been bitten time and again by problems that occurred only > in production, our test suite (despite taking already waay too long to > be truly useful in my daily development) was simply not good enough. > > So my plan was different: to let end users opt-in to test this new beast > thoroughly, more thoroughly than any review would. I agree with that 100%. We need to ensure "fallback to known working code" escape hatch is robust for that plan to work well, and that is why (1) I have been more focused on getting 1/2 right, and (2) I do not think it should be Windows-only like in your early plan, and (3) I do not think it would be "this will merely be there for only a month or so", like you said earlier. > And for that, environment variables are just not an option. I need > something that can be configured in a portable application, so that the > main Git for Windows installation is unaffected. I am not sure I follow here. Are you saying that the users who are opting into the experiment will keep two installations, one for daily use that avoids getting hit by the experimental code and the other that is used for testing? How are they switching between the two? By using different %PATH%? I am not sure how it is different from setting an environment $GIT_TEST_BUILTIN_DIFFTOOL. In any case, I do not have strong preference between environment and configuration. If you can make 1/2 robust with configuration, that is just as well. My message you are responding to was merely to suggest another possibility. The latter two points in my four-bullet list are hopefully still viable if you go with the configuration; it may go something like: - The bulk of the tests is moved into a common dot-sourced file, with (1) PERL prerequite stripped and (2) "git difftool" replaced with $git_difftool - Two test files do one of git_difftool="git difftool" git_difftool="git -c difftool.useBuiltin=true difftool" and include the dot-sourced file. The one that does the former needs to give up early depending on PERL prerequisite. perhaps. > My original "create a file in libexec/git-core/" was simple, did the job > reliably, and worked also for testing. It may have been OK for quick-and-dirty hack during development, but I do not think it was good in anything released.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi, On Mon, 28 Nov 2016, Junio C Hamano wrote: > Jeff King writes: > > > On Sat, Nov 26, 2016 at 02:01:36PM +0100, Johannes Schindelin wrote: > > > >> > If you want to control it from outside the test script, you'd need > >> > something like: > >> > > >> > if test "$GIT_TEST_DIFFTOOL" = "builtin" > >> > >> That is a bit magic. I first used "GIT_USE_BUILTIN_DIFFTOOL" and it did > >> not work. My name is arguably more correct (see also Jakub's note about > >> "naming is hard"), but yours works because there is a "TEST" substring in > >> it. > > > > Yes. You are free to add an exception to the env list in test-lib.sh, > > but we usually use GIT_TEST_* to avoid having to do so. > > Perhaps > > - The switch between "do I use builtin, or scripted?" mechanism in >1/2 can look at an environment (just like the old "am" rewrite >series did), instead of configuration. This would make the code >a lot more simppler (you do not have to worry about the >interaction between "setup" and .git/config). > > - That environment variable can be named GIT_TEST_BUILTIN_DIFFTOOL; >after all, people are opting into helping to test the new shiny >to make/prove it ready sooner. > > - The bulk of the existing test for difftool can be moved to a >dot-included file (in a way similar to t/annotate-tests are >usable to test both annotate and blame-imitating-annotate). >Existing PERL prerequisites can all be lost. > > - Two tests can include that dot-included file; one would >explicitly unset that environment (and gives up without PERL >prerequisite), while the other explicitly sets it. If my main worry was the test suite, I would agree with this plan. However, I have been bitten time and again by problems that occurred only in production, our test suite (despite taking already waay too long to be truly useful in my daily development) was simply not good enough. So my plan was different: to let end users opt-in to test this new beast thoroughly, more thoroughly than any review would. And for that, environment variables are just not an option. I need something that can be configured in a portable application, so that the main Git for Windows installation is unaffected. My original "create a file in libexec/git-core/" was simple, did the job reliably, and worked also for testing. It is a pity that you two gentlemen shot it down for being inelegant. And ever since, we try to find a solution that is as simple, works as reliably, also for testing, *and* appeases your tastes. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Jeff King writes: > On Sat, Nov 26, 2016 at 02:01:36PM +0100, Johannes Schindelin wrote: > >> > If you want to control it from outside the test script, you'd need >> > something like: >> > >> > if test "$GIT_TEST_DIFFTOOL" = "builtin" >> >> That is a bit magic. I first used "GIT_USE_BUILTIN_DIFFTOOL" and it did >> not work. My name is arguably more correct (see also Jakub's note about >> "naming is hard"), but yours works because there is a "TEST" substring in >> it. > > Yes. You are free to add an exception to the env list in test-lib.sh, > but we usually use GIT_TEST_* to avoid having to do so. Perhaps - The switch between "do I use builtin, or scripted?" mechanism in 1/2 can look at an environment (just like the old "am" rewrite series did), instead of configuration. This would make the code a lot more simppler (you do not have to worry about the interaction between "setup" and .git/config). - That environment variable can be named GIT_TEST_BUILTIN_DIFFTOOL; after all, people are opting into helping to test the new shiny to make/prove it ready sooner. - The bulk of the existing test for difftool can be moved to a dot-included file (in a way similar to t/annotate-tests are usable to test both annotate and blame-imitating-annotate). Existing PERL prerequisites can all be lost. - Two tests can include that dot-included file; one would explicitly unset that environment (and gives up without PERL prerequisite), while the other explicitly sets it.
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Sat, Nov 26, 2016 at 02:01:36PM +0100, Johannes Schindelin wrote: > > If you want to control it from outside the test script, you'd need > > something like: > > > > if test "$GIT_TEST_DIFFTOOL" = "builtin" > > That is a bit magic. I first used "GIT_USE_BUILTIN_DIFFTOOL" and it did > not work. My name is arguably more correct (see also Jakub's note about > "naming is hard"), but yours works because there is a "TEST" substring in > it. Yes. You are free to add an exception to the env list in test-lib.sh, but we usually use GIT_TEST_* to avoid having to do so. -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Sat, 26 Nov 2016, Jeff King wrote: > On Sat, Nov 26, 2016 at 01:22:28PM +0100, Johannes Schindelin wrote: > > > In other words, GIT_CONFIG_PARAMETERS is *explicitly scrubbed* from the > > environment when we run our tests (by the code block between the "before" > > and the "after" statements in the diff above). > > Sorry if I wasn't clear. I meant to modify t7800 to run the tests twice, > once with the existing script and once with the builtin. I.e., to set > the variable after test-lib.sh has done its scrubbing, and then use a > loop or similar to go through the tests twice. > > If you want to control it from outside the test script, you'd need > something like: > > if test "$GIT_TEST_DIFFTOOL" = "builtin" That is a bit magic. I first used "GIT_USE_BUILTIN_DIFFTOOL" and it did not work. My name is arguably more correct (see also Jakub's note about "naming is hard"), but yours works because there is a "TEST" substring in it. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Sat, Nov 26, 2016 at 01:22:28PM +0100, Johannes Schindelin wrote: > In other words, GIT_CONFIG_PARAMETERS is *explicitly scrubbed* from the > environment when we run our tests (by the code block between the "before" > and the "after" statements in the diff above). Sorry if I wasn't clear. I meant to modify t7800 to run the tests twice, once with the existing script and once with the builtin. I.e., to set the variable after test-lib.sh has done its scrubbing, and then use a loop or similar to go through the tests twice. If you want to control it from outside the test script, you'd need something like: if test "$GIT_TEST_DIFFTOOL" = "builtin" then GIT_CONFIG_PARAMETERS=... fi -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Fri, 25 Nov 2016, Jeff King wrote: > On Fri, Nov 25, 2016 at 06:41:23PM +0100, Johannes Schindelin wrote: > > > > Ah, I didn't realize that was a requirement. If this is going to be part > > > of a release and real end-users are going to see it, that does make me > > > think the config option is the better path (than the presence of some > > > file), as it's our standard way of tweaking run-time behavior. > > > > So how do you easily switch back and forth between testing the old vs the > > new difftool via the test suite? > > If it's for a specific tool, I'd consider teaching the test suite to run > the whole script twice: once with the flag set and once without. > > That is sometimes more complicated, though, if the script creates many > sub-repos. An environment variable might be more natural. If you already > support flipping the default via config, you can probably do: > > GIT_CONFIG_PARAMETERS="'difftool.usebuiltin=true'" > export GIT_CONFIG_PARAMETERS Except that that does not work, of course. To figure out why, apply this diff: -- snip -- diff --git a/t/t7800-difftool.sh b/t/t7800-difftool.sh index 17f3008277..27159f65f3 100755 --- a/t/t7800-difftool.sh +++ b/t/t7800-difftool.sh @@ -10,6 +10,9 @@ Testing basic diff tool invocation . ./test-lib.sh +echo "config $(git config difftool.usebuiltin)." >&2 +exit 1 + difftool_test_setup () { test_config diff.tool test-tool && diff --git a/t/test-lib.sh b/t/test-lib.sh index 9980a46133..0ddeded92b 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -86,6 +86,7 @@ EDITOR=: # /usr/xpg4/bin/sh and /bin/ksh to bail out. So keep the unsets # deriving from the command substitution clustered with the other # ones. +echo "before $(git config difftool.usebuiltin)." >&2 unset VISUAL EMAIL LANGUAGE COLUMNS $("$PERL_PATH" -e ' my @env = keys %ENV; my $ok = join("|", qw( @@ -104,6 +105,7 @@ unset VISUAL EMAIL LANGUAGE COLUMNS $("$PERL_PATH" -e ' my @vars = grep(/^GIT_/ && !/^GIT_($ok)/o, @env); print join("\n", @vars); ') +echo "after $(git config difftool.usebuiltin)." >&2 unset XDG_CONFIG_HOME unset GITPERLLIB GIT_AUTHOR_EMAIL=aut...@example.com -- snap -- and then weep at this output: GIT_CONFIG_PARAMETERS="'difftool.usebuiltin=true'"; export GIT_CONFIG_PARAMETERS; bash t7800-difftool.sh -i -v -x before true. after . Initialized empty Git repository in /home/virtualbox/git/git-for-windows/t/trash directory.t7800-difftool/.git/ config . FATAL: Unexpected exit with code 1 In other words, GIT_CONFIG_PARAMETERS is *explicitly scrubbed* from the environment when we run our tests (by the code block between the "before" and the "after" statements in the diff above). Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Fri, Nov 25, 2016 at 06:41:23PM +0100, Johannes Schindelin wrote: > > Ah, I didn't realize that was a requirement. If this is going to be part > > of a release and real end-users are going to see it, that does make me > > think the config option is the better path (than the presence of some > > file), as it's our standard way of tweaking run-time behavior. > > So how do you easily switch back and forth between testing the old vs the > new difftool via the test suite? If it's for a specific tool, I'd consider teaching the test suite to run the whole script twice: once with the flag set and once without. That is sometimes more complicated, though, if the script creates many sub-repos. An environment variable might be more natural. If you already support flipping the default via config, you can probably do: GIT_CONFIG_PARAMETERS="'difftool.usebuiltin=true'" export GIT_CONFIG_PARAMETERS -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Fri, 25 Nov 2016, Jeff King wrote: > On Fri, Nov 25, 2016 at 12:05:00PM +0100, Johannes Schindelin wrote: > > > > I would have expected it to just be a build-time flag, like: > > > > > > make BUILTIN_DIFFTOOL=Yes test > > > > That works for Git developers. > > > > I want to let as many users as possible test the builtin difftool. > > Hopefully a lot more users than there are Git developers. > > > > Which means that I need a feature flag in production code, not a build > > time flag. > > Ah, I didn't realize that was a requirement. If this is going to be part > of a release and real end-users are going to see it, that does make me > think the config option is the better path (than the presence of some > file), as it's our standard way of tweaking run-time behavior. So how do you easily switch back and forth between testing the old vs the new difftool via the test suite? Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Fri, Nov 25, 2016 at 12:05:00PM +0100, Johannes Schindelin wrote: > > I would have expected it to just be a build-time flag, like: > > > > make BUILTIN_DIFFTOOL=Yes test > > That works for Git developers. > > I want to let as many users as possible test the builtin difftool. > Hopefully a lot more users than there are Git developers. > > Which means that I need a feature flag in production code, not a build > time flag. Ah, I didn't realize that was a requirement. If this is going to be part of a release and real end-users are going to see it, that does make me think the config option is the better path (than the presence of some file), as it's our standard way of tweaking run-time behavior. The implementation can still remain slightly gross if it's eventually going away. > > I'm happy with pretty much anything under the reasoning of "this does not > > matter much because it is going away soon". > > Yeah, well, I am more happy with anything along the lines of David's > review, pointing out flaws in the current revision of the builtin difftool > before it bites users ;-) Sorry, I can't really help much there, not having much knowledge of difftool. -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Thu, 24 Nov 2016, Jeff King wrote: > On Thu, Nov 24, 2016 at 10:56:23PM +0100, Johannes Schindelin wrote: > > > > I think it would probably be OK to ship with that caveat (people would > > > probably use --global config, or "git -c" for a quick override), but if > > > you really wanted to address it, you can do something like what > > > pager.c:read_early_config() does. > > > > The config setting is already overkill (and does even make something much > > harder than before: running tests with the builtin difftool used to be as > > simply as `touch use-builtin-difftool && make -C t t7800-difftool.sh, now > > I have to edit t7800-difftool.sh to configure difftool.useBuiltin, and > > without the repo-level config even that would not be working). > > > > Imitating read_early_config() would be overkill deluxe. > > I would have expected it to just be a build-time flag, like: > > make BUILTIN_DIFFTOOL=Yes test That works for Git developers. I want to let as many users as possible test the builtin difftool. Hopefully a lot more users than there are Git developers. Which means that I need a feature flag in production code, not a build time flag. > I'm happy with pretty much anything under the reasoning of "this does not > matter much because it is going away soon". Yeah, well, I am more happy with anything along the lines of David's review, pointing out flaws in the current revision of the builtin difftool before it bites users ;-) Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Thu, Nov 24, 2016 at 10:56:23PM +0100, Johannes Schindelin wrote: > > I think it would probably be OK to ship with that caveat (people would > > probably use --global config, or "git -c" for a quick override), but if > > you really wanted to address it, you can do something like what > > pager.c:read_early_config() does. > > The config setting is already overkill (and does even make something much > harder than before: running tests with the builtin difftool used to be as > simply as `touch use-builtin-difftool && make -C t t7800-difftool.sh, now > I have to edit t7800-difftool.sh to configure difftool.useBuiltin, and > without the repo-level config even that would not be working). > > Imitating read_early_config() would be overkill deluxe. I would have expected it to just be a build-time flag, like: make BUILTIN_DIFFTOOL=Yes test but I did not closely follow the rest of the conversation, so I am probably just repeating bits that were already said. So probably ignore me. I'm happy with pretty much anything under the reasoning of "this does not matter much because it is going away soon". -Peff
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
Hi Peff, On Thu, 24 Nov 2016, Jeff King wrote: > On Thu, Nov 24, 2016 at 09:55:07PM +0100, Johannes Schindelin wrote: > > > +/* > > + * NEEDSWORK: this function can go once the legacy-difftool Perl script is > > + * retired. > > + * > > + * We intentionally avoid reading the config directly here, to avoid > > messing up > > + * the GIT_* environment variables when we need to fall back to exec()ing > > the > > + * Perl script. > > + */ > > +static int use_builtin_difftool(void) { > > + struct child_process cp = CHILD_PROCESS_INIT; > > + struct strbuf out = STRBUF_INIT; > > + int ret; > > + > > + argv_array_pushl(&cp.args, > > +"config", "--bool", "difftool.usebuiltin", NULL); > > + cp.git_cmd = 1; > > + if (capture_command(&cp, &out, 6)) > > + return 0; > > + strbuf_trim(&out); > > + ret = !strcmp("true", out.buf); > > + strbuf_release(&out); > > + return ret; > > +} > > FWIW, it should mostly Just Work to use the internal config functions > these days, with the caveat that they will not read repo-level config if > you haven't done repo setup yet. > > I think it would probably be OK to ship with that caveat (people would > probably use --global config, or "git -c" for a quick override), but if > you really wanted to address it, you can do something like what > pager.c:read_early_config() does. The config setting is already overkill (and does even make something much harder than before: running tests with the builtin difftool used to be as simply as `touch use-builtin-difftool && make -C t t7800-difftool.sh, now I have to edit t7800-difftool.sh to configure difftool.useBuiltin, and without the repo-level config even that would not be working). Imitating read_early_config() would be overkill deluxe. > Of course, your method here is fine, too; I just know you are sensitive > to forking extra processes. > > Also, a minor nit: capture_command() might return data in "out" with a > non-zero exit if the command both generates stdout and exits non-zero > itself. I'm not sure that's possible with git-config, though, so it > might not be worth worrying about. As it is, I spent way too much time on a feature flag *that will go away real soon*. And not only I spent too much time on it: everybody who even bothered to think about it spent too much time on it. It is a temporary feature flag. It will go away. If it is inefficient, or inelegant, it won't matter in a month from now. In other words, it was not really necessary to spend all of that time and all of that brain power to first discuss the shortcomings of having the presence of a file in exec path as a feature flag, then converting it into a config setting, only to find out that this *causes* problems that were not there before. Frankly, that was not what I was hoping for. I was hoping to get a decent review of the *difftool* functionality. David did a fine job to provide that review. All that discussion about the temporary feature flag was really a side track for me, and I hate to admit that I let myself get sucked into it, and it cost me quite some time that I would have rather spent on release engineering the preview based on v2.11.0-rc3 (including the difftool with *any* type of working opt-in feature flag). Maybe I will write more explicitly in the next cover letter what I intend to do, and what parts of the patch series is intended to be throw-away material (hence "good enough" is good enough, and discussions about it are really wasting all of our time). Of course, I cannot dictate what other people find interesting to comment on, but at least I will have a good excuse to ignore suggestions that only distract from the work that is really needed. My only solace is that I did some substantially more intensive testing in the wake of this discussion, thanks to the test suite breakages incurred by switching the feature flag to a soon-to-be-abandoned config setting. That makes me much more confident about the builtin difftool, which is all I wanted in the first place. Ciao, Dscho
Re: [PATCH v3 1/2] difftool: add a skeleton for the upcoming builtin
On Thu, Nov 24, 2016 at 09:55:07PM +0100, Johannes Schindelin wrote: > +/* > + * NEEDSWORK: this function can go once the legacy-difftool Perl script is > + * retired. > + * > + * We intentionally avoid reading the config directly here, to avoid messing > up > + * the GIT_* environment variables when we need to fall back to exec()ing the > + * Perl script. > + */ > +static int use_builtin_difftool(void) { > + struct child_process cp = CHILD_PROCESS_INIT; > + struct strbuf out = STRBUF_INIT; > + int ret; > + > + argv_array_pushl(&cp.args, > + "config", "--bool", "difftool.usebuiltin", NULL); > + cp.git_cmd = 1; > + if (capture_command(&cp, &out, 6)) > + return 0; > + strbuf_trim(&out); > + ret = !strcmp("true", out.buf); > + strbuf_release(&out); > + return ret; > +} FWIW, it should mostly Just Work to use the internal config functions these days, with the caveat that they will not read repo-level config if you haven't done repo setup yet. I think it would probably be OK to ship with that caveat (people would probably use --global config, or "git -c" for a quick override), but if you really wanted to address it, you can do something like what pager.c:read_early_config() does. Of course, your method here is fine, too; I just know you are sensitive to forking extra processes. Also, a minor nit: capture_command() might return data in "out" with a non-zero exit if the command both generates stdout and exits non-zero itself. I'm not sure that's possible with git-config, though, so it might not be worth worrying about. -Peff