Re: git status always modifies index?
Jonathan Niederwrites: > I am worried that the project is not learning from what happened here. > ... > Fair enough, though that feels like overengineering. But I *still* > don't see what that has to do with the name "no-optional-locks". When > is a lock *optional*? And how am I supposed to discover this option? > > This also came up during review, and I am worried that this review > feedback is being ignored. In other words, I have no reason to > believe it won't happen again. I too would like to see this part explained a bit better.
Re: git status always modifies index?
On Mon, Nov 27, 2017 at 12:57:31PM -0800, Jonathan Nieder wrote: > > I actually consider "--no-optional-locks" to be such an aspirational > > feature. I didn't go digging for other cases (though I'm fairly certain > > that "diff" has one), but hoped to leave it for further bug reports ("I > > used the option, ran command X, and saw lock contention"). > > I am worried that the project is not learning from what happened here. > > My main issue with the --no-optional-locks name is that it does not > connect to the underlying user need. Your main argument for it is > that it exactly describes the underlying user need. One of us has to > be wrong. Or there's a false dichotomy. ;) We could be talking about two different users. > So let me describe my naive reading: > > As a user, I want to inspect the state of the repository without > disrupting it in any way. That means not breaking concurrent > processes and not upsetting permissions. --read-only seems to > describe this use case to me perfectly. That does not match the request that I got from real script writers who were having a problem. They wanted to avoid lock contention with background tasks. They don't care if the repository is modified as long as it is done in a safe and non-conflicting way. I agree (as I think I've said already in this thread) that --read-only would be a superset of that. And that it would probably be OK to have just gone there in the first place, sacrificing a small amount of specificity in the name of having fewer knobs for the user to turn. > If I understood correctly, your objection is that --read-only is not > specific enough. What I really want, you might say, is not to break > concurrent processes. Any other aspects of being read-only are not > relevant. E.g. if I can refresh the on-disk index using O_APPEND > without disrupting concurrent processes then I should be satisfied > with that. Do I have an objection? It's not clear to me that anybody is actually proposing anything concrete for me to object to. Are we adding "--read-only"? Are we going back to "status --no-lock-index"? In either case, are we deprecating "--no-optional-locks"? It sounds like you are arguing for the first, and it sounds like Dscho is arguing for the second. Frankly, I don't really care that much. I've said all that I can on why I chose the direction I did, and I remain unconvinced that we have evidence that the current option is somehow impossible to find. If somebody wants to take us down one of the other roads, that's fine by me. > Fair enough, though that feels like overengineering. But I *still* > don't see what that has to do with the name "no-optional-locks". When > is a lock *optional*? And how am I supposed to discover this option? I kind of feel like any answer I give to these questions is just going to be waved aside. But here are my earnest answers: 1. You are bit by lock contention, where running operation X ends up with some error like "unable to create index.lock: file exists". "X" is probably something like "commit". 2. You search the documentation for options related to locks. You're not likely to find it in the manpage for X, since the root of the problem actually has nothing to do with X in the first place. It's a background task running "status" that is the problem. 3. You might find it in git(1) while searching for information on locks, since "lock" is in the name of the option (and is in fact the only hit in that page). The index is also mentioned there (though searching for "index" yields a lot more hits). 4. You might find it in git-status(1) if you suspect that "status" is at work. Searching for "index" or "lock" turns up the addition I just proposed yesterday. There are obviously a lot of places where that sequence might fail to find a hit. But the same is true of just about any option, including putting "--read-only" into git(1). > This also came up during review, and I am worried that this review > feedback is being ignored. In other words, I have no reason to > believe it won't happen again. I'm having a hard time figuring out what you mean here. Do you mean that I ignored feedback on this topic during the initial review? Looking at the original thread, I just don't see it. There was some question about the name. I tried to lay out my thinking here: https://public-inbox.org/git/20170921050835.mrbgx2zryy3ju...@sigill.intra.peff.net/ and ended with: I am open to a better name, but I could not come up with one. There was no meaningful response on the topic. When I reposted v2, I tried to bring attention to that with: - there was some discussion over the name. I didn't see other suggestions, and I didn't come up with anything better. So...am I missing something? Am I misunderstanding your point? > > I would be fine with having a further aspirational "read only" mode. > > Excellent, we seem to agree on this much.
Re: git status always modifies index?
Hi, Jeff King wrote: > On Sun, Nov 26, 2017 at 06:35:56PM +0900, Junio C Hamano wrote: >> Having a large picture option like "--read-only" instead of ending >> up with dozens of "we implemented a knob to tweak only this little >> piece, and here is an option to trigger it" would help users in the >> long run, but we traditionally did not do so because we tend to >> avoid shipping "incomplete" features, but being perfectionist with >> such a large undertaking can stall topics with feature bloat. In a >> case like this, however, I suspect that an aspirational feature that >> starts small, promises little and can be extended over time may be a >> good way to move forward. > > I actually consider "--no-optional-locks" to be such an aspirational > feature. I didn't go digging for other cases (though I'm fairly certain > that "diff" has one), but hoped to leave it for further bug reports ("I > used the option, ran command X, and saw lock contention"). I am worried that the project is not learning from what happened here. My main issue with the --no-optional-locks name is that it does not connect to the underlying user need. Your main argument for it is that it exactly describes the underlying user need. One of us has to be wrong. So let me describe my naive reading: As a user, I want to inspect the state of the repository without disrupting it in any way. That means not breaking concurrent processes and not upsetting permissions. --read-only seems to describe this use case to me perfectly. If I understood correctly, your objection is that --read-only is not specific enough. What I really want, you might say, is not to break concurrent processes. Any other aspects of being read-only are not relevant. E.g. if I can refresh the on-disk index using O_APPEND without disrupting concurrent processes then I should be satisfied with that. Fair enough, though that feels like overengineering. But I *still* don't see what that has to do with the name "no-optional-locks". When is a lock *optional*? And how am I supposed to discover this option? This also came up during review, and I am worried that this review feedback is being ignored. In other words, I have no reason to believe it won't happen again. > I would be fine with having a further aspirational "read only" mode. Excellent, we seem to agree on this much. If I can find time for it today then I'll write a patch. Thanks, Jonathan
Re: git status always modifies index?
Hi Junio, On Mon, 27 Nov 2017, Junio C Hamano wrote: > Jeff Kingwrites: > > > Again, maybe the bit above explains my viewpoint a bit more. I'm > > certainly sympathetic to the pain of upstreaming. > > > > I do disagree with the "no good reason" for this particular patch. > > > > Certainly you should feel free to present your hunches. I'd expect you > > to as part of the review (I'm pretty sure I even solicited your opinion > > when I sent the original patch). But I also think it's important for > > patches sent upstream to get thorough review (both for code and design). > > The patches having been in another fork (and thus presumably being > > stable) is one point in their favor, but I don't think it should trumps > > all other discussion. > > I haven't been following this subthread closely, but I agree. I > think your turning a narrow option that was only about status into > something that can be extended as a more general option resulted in > a better design overall. The --no-optional-locks feature is - hard to find, - in the current scenarios less desirable than a very concrete "do not write index.lock files in `git status`", - too simple to introduce to merit introducing it *before* a need for it arises that is larger than `git status --no-lock-index`, and you would still have to keep the latter because it is a very concrete and real use case that is unlikely to want to avoid other .lock files too. So while you two are happily on agreeing with one another, the reality is that this supposedly better design is nothing else than premature optimization. Ciao, Dscho
Re: git status always modifies index?
Hi, Johannes Schindelin wrote: > On Mon, 27 Nov 2017, Jeff King wrote: >> [...] IMHO it argues for GfW trying to land patches upstream first, and >> then having them trickle in as you merge upstream releases. > > You know that I tried that, and you know why I do not do that anymore: it > simply takes too long, and the review on the list focuses on things I > cannot focus on as much, I need to make sure that the patches *work* > first, whereas the patch review on the Git mailing list tends to ensure > that they have the proper form first. > > I upstream patches when I have time. You have been developing in the open, so no complaints from me, just a second point of reference: For Google's internal use we sometimes have needed a patch faster than upstream can review it. Our approach in those cases has been to send a patch to the mailing list and then apply it internally immediately. If upstream is stalled for months on review, so be it --- we already have the patch. But this tends to help ensure that we are moving in the same direction. That said, I don't think that was the main issue with --no-optional-locks. I'll comment more on that in another subthread. Thanks, Jonathan
Re: git status always modifies index?
Hi Peff, On Mon, 27 Nov 2017, Jeff King wrote: > [...] IMHO it argues for GfW trying to land patches upstream first, and > then having them trickle in as you merge upstream releases. You know that I tried that, and you know why I do not do that anymore: it simply takes too long, and the review on the list focuses on things I cannot focus on as much, I need to make sure that the patches *work* first, whereas the patch review on the Git mailing list tends to ensure that they have the proper form first. I upstream patches when I have time. Ciao, Dscho
Re: git status always modifies index?
On Mon, Nov 27, 2017 at 09:47:20AM +0900, Junio C Hamano wrote: > Jeff Kingwrites: > > > I'm not sure I agree. Lockless writes are actually fine for the original > > use case of --no-optional-locks (which is a process for the same user > > that just happens to run in the background). > > The phrase "lockless write" scares me---it sounds as if you > overwrite the index file no matter what other people (including > another instance of yourself) are doing to it. Ick, no, that would be quite bad. ;) I only meant that if we "somehow" had a way in the future to update the stat cache without affecting the other parts of the index, and without causing lock contention that causes other readers to barf, it could be triggered even under this option. That would be quite different from the current index and stat-cache design, and I have no plans in that area. Writes to the object database _are_ lockless now (it is OK if two writers collide, because they are by definition writing the same data). And I wouldn't expect them to be affected by --no-optional-locks. I think elsewhere in the thread you mentioned writing out trees for cache-tree, which seems like a plausible example. Usually there's not much point if you're not going to write out the index with the new cache-tree entries, too. But I could see a program wanting to convert the index into a tree in order to speed up a series of tree-to-index diffs within a single program. This is all pretty hypothetical, though. -Peff
Re: git status always modifies index?
Jeff Kingwrites: > Again, maybe the bit above explains my viewpoint a bit more. I'm > certainly sympathetic to the pain of upstreaming. > > I do disagree with the "no good reason" for this particular patch. > > Certainly you should feel free to present your hunches. I'd expect you > to as part of the review (I'm pretty sure I even solicited your opinion > when I sent the original patch). But I also think it's important for > patches sent upstream to get thorough review (both for code and design). > The patches having been in another fork (and thus presumably being > stable) is one point in their favor, but I don't think it should trumps > all other discussion. I haven't been following this subthread closely, but I agree. I think your turning a narrow option that was only about status into something that can be extended as a more general option resulted in a better design overall. I am guessing that a little voice in his head said "this may be applicable wider than Windows and it will be better to be humble and receptive to others' suggestions by going to the list and get this patch reviewed" was overridden by other needs, like expediency, when he did the initial "covers only status and its opportunistic writing of the index" as a Windows only thing, and Dscho is now regretting not following that initial hunch, as that resulted in unnecessary pain for both himself and his users. I am sympathetic, but we are all normal human and I do not think and mistakes like this can be avoided. Often we are blinded by the immediate issue in front of us and we lose sight of a bigger picture, and it is OK as long as we learn from our (or better yet, others') mistakes. Thanks.
Re: git status always modifies index?
On Sun, Nov 26, 2017 at 10:55:01PM +0100, Johannes Schindelin wrote: > > I remain unconvinced that we have actually uncovered a serious problem. > > You did not. A colleague of mine did. And it was a problem in Git for > Windows only, caused by the changes necessitated by yours (which even used > my tests, which made it easy for my conflict resolution to do the wrong > thing by removing my --no-lock-index test in favor of your > --no-optional-locks test, breaking --no-lock-index). > > It cost me almost two work days, and a lot of hair. And all I meant by "I > hinted at it, too" was that I felt that something like that was coming > when I saw your variation of my patches making it into git/git's master. I was confused by your mention of a problem, since this was the first I heard about it. Looking at the GfW repo, I assume you mean the bits touched by 45538830baf. If so, then yes, I'm sad that the combination of the features caused extra work for you. But I also don't think that is a compelling reason to say that "--no-optional-locks" is the wrong approach. It _does_ argue for trying to take features intact between the two codebases. But I am not sure I buy that argument. The original feature got no review on the list, and in fact most of us weren't even aware of it until encountering the problem independently. IMHO it argues for GfW trying to land patches upstream first, and then having them trickle in as you merge upstream releases. I suspect you are going to say "but I am busy and don't have time for that". And I know it takes time. I maintain the fork that GitHub runs on its servers, and I have a backlog of features to upstream. Some of them end up quite different when I do that, and it's a huge pain. But ultimately I've forked the upstream project, and that's the price I pay. Upstream doesn't care about my fork's problems. I dunno. Maybe you do not see Git for Windows as such a fork. But speaking as somebody who works on git.git, that is my perception of it (that GfW is downstream). So I'm sympathetic, but I don't like the idea of taking non-Windows-specific patches wholesale and skipping the list review and design process. > > Somebody asked if Git could do a thing, and people pointed him to the > > right option. > > If people have to ask on the mailing list even after reading the man > pages, that's a strong indicator that we could do better. Sure. That's why I suggested improving the documentation in my last email. But in all the discussion, I haven't seen any patch to that effect. > In Git, yes. In Git for Windows, no. And it worked beautifully in Git for > Windows before v2.15.0. It didn't in git.git, because it wasn't there. ;) > > But I figured you would carry "status --no-lock-index" forever in Git > > for Windows anyway (after all, if you remove it now, you're breaking > > compatibility for existing users). > > I will not carry it forever. Most definitely not. It was marked as > experimental for a reason: I suspected that major changes would be > required to get it accepted into git.git (even if I disagree from a purely > practicial point of view that those changes are required, but that's what > you have to accept when working in Open Source, that you sometimes have to > change something solely to please the person who can reject your patches). Yes, I saw just now that you continue to recognize it and give a deprecation warning, which seems like quite a reasonable thing to do. > Sure, I am breaking compatibility for existing users, but that is more the > fault of --no-optional-locks being introduced than anything else. Hopefully the text at the start of this mail explains why I disagree on the "fault" here. > I am pretty much done talking about this subject at this point. I only > started talking about it because I wanted you to understand that I will > insist on my hunches more forcefully in the future, and I hope you will > see why I do that. But then, you may not even see the problems caused by > the renaming (and forced broader scope for currently no good reason) of > --no-lock-index, so maybe you disagree that acting on my hunch would have > prevented those problems. Again, maybe the bit above explains my viewpoint a bit more. I'm certainly sympathetic to the pain of upstreaming. I do disagree with the "no good reason" for this particular patch. Certainly you should feel free to present your hunches. I'd expect you to as part of the review (I'm pretty sure I even solicited your opinion when I sent the original patch). But I also think it's important for patches sent upstream to get thorough review (both for code and design). The patches having been in another fork (and thus presumably being stable) is one point in their favor, but I don't think it should trumps all other discussion. -Peff
Re: git status always modifies index?
On Mon, Nov 27, 2017 at 01:56:41PM +0900, Junio C Hamano wrote: > Jeff Kingwrites: > > > I actually consider "--no-optional-locks" to be such an aspirational > > feature. I didn't go digging for other cases (though I'm fairly certain > > that "diff" has one), but hoped to leave it for further bug reports ("I > > used the option, ran command X, and saw lock contention"). > > OK, then we are essentially on the same page. I just was hoping > that we can restrain ourselves from adding these "non essential" > knobs at too fine granularity, ending up forcing end users to use > all of them. Yes, I agree we should try not to have too many knobs. That's actually one of the reasons I avoided a status-only option in the first place. In retrospect, I agree that the current option probably doesn't get the granularity quite right. The idea of "totally read-only" just didn't cross my mind at all when working on the earlier feature. -Peff
Re: git status always modifies index?
Jeff Kingwrites: > I actually consider "--no-optional-locks" to be such an aspirational > feature. I didn't go digging for other cases (though I'm fairly certain > that "diff" has one), but hoped to leave it for further bug reports ("I > used the option, ran command X, and saw lock contention"). OK, then we are essentially on the same page. I just was hoping that we can restrain ourselves from adding these "non essential" knobs at too fine granularity, ending up forcing end users to use all of them.
Re: git status always modifies index?
On Sun, Nov 26, 2017 at 06:35:56PM +0900, Junio C Hamano wrote: > Having a large picture option like "--read-only" instead of ending > up with dozens of "we implemented a knob to tweak only this little > piece, and here is an option to trigger it" would help users in the > long run, but we traditionally did not do so because we tend to > avoid shipping "incomplete" features, but being perfectionist with > such a large undertaking can stall topics with feature bloat. In a > case like this, however, I suspect that an aspirational feature that > starts small, promises little and can be extended over time may be a > good way to move forward. I actually consider "--no-optional-locks" to be such an aspirational feature. I didn't go digging for other cases (though I'm fairly certain that "diff" has one), but hoped to leave it for further bug reports ("I used the option, ran command X, and saw lock contention"). I would be fine with having a further aspirational "read only" mode. As I said before, that's not quite the same thing as no-optional-locks, but I think they're close enough that I'd be fine having only one of them. But now that we've shipped a version with the locking one, we're stuck with at least for the duration of a deprecation cycle. -Peff
Re: git status always modifies index?
Jeff Kingwrites: > I'm not sure I agree. Lockless writes are actually fine for the original > use case of --no-optional-locks (which is a process for the same user > that just happens to run in the background). The phrase "lockless write" scares me---it sounds as if you overwrite the index file no matter what other people (including another instance of yourself) are doing to it. Side note: What 'use-optional-locks' actually does is not to give any file descriptor to write into when we invoke the wt-status helpers (which would want to make an opportunistic update to the index under the lock), so "--no-optional-locks" is quite different from "lockless write". Whew. It is part of what "semantically read-only things do not write" would have been. How would a true "lockless write" that is OK for background opportunistic refresh work? Read, compute and then open the final index file under its final name for writing and write it out, without involving any rename? As long as it finishes writing the result in full and closes, its competing with a real "lockful write" would probably be safe when it loses (the lockful one will rename its result over to the refreshed one). It cannot "win" the race by writing into the temporary lock file the other party is using ;-) But it may lose the race in a messy way---the lockful one tries to rename its result over to the real index, which the lockless one has still open and writing. Unix variants are probably OK with it and the lockless one would lose gracefully, but on other platforms the lockful one would fail to rename, I suspect? Or the lockless one can crash while it is writing even if there is no race. Or do you mean something different by "lockless write"?
Re: git status always modifies index?
Hi Peff, On Sun, 26 Nov 2017, Jeff King wrote: > On Sat, Nov 25, 2017 at 10:55:25PM +0100, Johannes Schindelin wrote: > > > > Right, I went a little off track of your original point. > > > > > > What I was trying to get at is that naming it "status --no-lock-index" > > > would not be the same thing (even though with the current implementation > > > it would behave the same). IOW, can we improve the documentation of > > > "status" to point to make it easier to discover this use case. > > > > I had the hunch that renaming the option (and moving it away from `git > > status`, even if it is currently only affecting `git status` and even if > > it will most likely be desirable to have the option to really only prevent > > `git status` from writing .lock files) was an unfortunate decision (and > > made my life as Git for Windows quite a bit harder than really necessary, > > it cost me over one workday of a bug hunt, mainly due to a false flag > > indicating `git rebase` to be the culprit). And I hinted at it, too. > > I remain unconvinced that we have actually uncovered a serious problem. You did not. A colleague of mine did. And it was a problem in Git for Windows only, caused by the changes necessitated by yours (which even used my tests, which made it easy for my conflict resolution to do the wrong thing by removing my --no-lock-index test in favor of your --no-optional-locks test, breaking --no-lock-index). It cost me almost two work days, and a lot of hair. And all I meant by "I hinted at it, too" was that I felt that something like that was coming when I saw your variation of my patches making it into git/git's master. This kind of stuff really throws my upstreaming back quite a bit, not only due to lost time, but also due to the frustration with the caused regressions. Now, the report indicates that not only Git for Windows had a problem, but that the new feature is unnecessarily unintuitive. I would even claim that the --no-lock-index option (even if it does not say "--read-only") would have made for a better user experience because it is at least in the expected place: the `git status` man page. > Somebody asked if Git could do a thing, and people pointed him to the > right option. If people have to ask on the mailing list even after reading the man pages, that's a strong indicator that we could do better. > That option is new in the latest release. In Git, yes. In Git for Windows, no. And it worked beautifully in Git for Windows before v2.15.0. > > I really never understood why --no-optional-locks had to be introduced > > when it did exactly the same as --no-lock-index, and when the latter has a > > right to exist in the first place, even in the purely hypothetical case > > that we teach --no-optional-locks to handle more cases than just `git > > status`' writing of the index (and in essence, it looks like premature > > optimization): it is a very concrete use case that a user may want `git > > status` to refrain from even trying to write any file, as this thread > > shows very eloquently. > > Besides potentially handling more than just "git status", ... which is a premature optimization... > it differs in one other way: it can be triggered via and is carried > through the environment. ... which Git for Windows' --no-lock-index *also* had to do (think submodules). We simply figured that out only after introducing the option, therefore it was carried as an add-on commit, and the plan was to squash it in before upstreaming (obviously!). So I contest your claim. `--no-lock-index` must be propagated to callees in the same way as the (still hypothetical) `--no-optional-locks` that would cover more than just `git status`. > > Maybe it is time to reintroduce --no-lock-index, and make > > --no-optional-locks' functionality a true superset of --no-lock-index'. > > I'm not against having a separate option for "never write to the > repository". Whoa, slow down. We already introduced the `--no-optional-locks` option for a completely hypothetical use case covering more than just `git status`, a use case that may very well never see the light of day. (At least it was my undederstanding that the conjecture of something like that maybe being needed by somebody some time in the future was the entire reason tobutcher the --no-lock-index approach into a very different looking --no-optional-locks that is much harder to find in the documentation.) Let's not introduce yet another option for a completely hypothetical use case that may be even more theoretical. > I think it's potentially different than "don't lock", as I > mentioned earlier. I don't see the need at all at the moemnt. > Frankly I don't see much value in "--no-lock-index" if we already have > "--no-optional-locks". Funny. I did not (and still do not) see the need for renaming `git status --no-lock-index` to `git --no-optional-locks status` (i.e. cluttering the global option space for something that really only `git status` needs).
Re: git status always modifies index?
On Sun, Nov 26, 2017 at 12:32:13PM +0900, Junio C Hamano wrote: > Jeff Kingwrites: > > > What I was trying to get at is that naming it "status --no-lock-index" > > would not be the same thing (even though with the current implementation > > it would behave the same). IOW, can we improve the documentation of > > "status" to point to make it easier to discover this use case. > > Yeah, the name is unfortunate. > > What the end user really wants to see, I suspect, is a "--read-only" > option that applies to any filesystem entity and to any command, in > the context of this thread, and also in the original discussion that > led to the introduction of that option. I'm not sure I agree. Lockless writes are actually fine for the original use case of --no-optional-locks (which is a process for the same user that just happens to run in the background). I can buy the distinction between that and "--read-only" as premature optimization, though, since it's not common for most operations to do non-locking writes (pretty much object writes are the only thing, and most "semantically read-only" operations like status or diff do not write any objects). So there's very little lost by people in the first boat saying "--read-only". -Peff
Re: git status always modifies index?
On Sat, Nov 25, 2017 at 10:55:25PM +0100, Johannes Schindelin wrote: > > Right, I went a little off track of your original point. > > > > What I was trying to get at is that naming it "status --no-lock-index" > > would not be the same thing (even though with the current implementation > > it would behave the same). IOW, can we improve the documentation of > > "status" to point to make it easier to discover this use case. > > I had the hunch that renaming the option (and moving it away from `git > status`, even if it is currently only affecting `git status` and even if > it will most likely be desirable to have the option to really only prevent > `git status` from writing .lock files) was an unfortunate decision (and > made my life as Git for Windows quite a bit harder than really necessary, > it cost me over one workday of a bug hunt, mainly due to a false flag > indicating `git rebase` to be the culprit). And I hinted at it, too. I remain unconvinced that we have actually uncovered a serious problem. Somebody asked if Git could do a thing, and people pointed him to the right option. That option is new in the latest release. So it is entirely plausible to me that the new option is just fine and: 1. We could adjust the documentation to cross-reference from git-status. 2. Now that the new option exists, informal documentation will start to mention it. Including this thread in the mailing list archive, and the stack overflow thread that was linked. > I really never understood why --no-optional-locks had to be introduced > when it did exactly the same as --no-lock-index, and when the latter has a > right to exist in the first place, even in the purely hypothetical case > that we teach --no-optional-locks to handle more cases than just `git > status`' writing of the index (and in essence, it looks like premature > optimization): it is a very concrete use case that a user may want `git > status` to refrain from even trying to write any file, as this thread > shows very eloquently. Besides potentially handling more than just "git status", it differs in one other way: it can be triggered via and is carried through the environment. > Maybe it is time to reintroduce --no-lock-index, and make > --no-optional-locks' functionality a true superset of --no-lock-index'. I'm not against having a separate option for "never write to the repository". I think it's potentially different than "don't lock", as I mentioned earlier. Frankly I don't see much value in "--no-lock-index" if we already have "--no-optional-locks". But I figured you would carry "status --no-lock-index" forever in Git for Windows anyway (after all, if you remove it now, you're breaking compatibility for existing users). -Peff
Re: git status always modifies index?
Junio C Hamanowrites: > Jeff King writes: > >> What I was trying to get at is that naming it "status --no-lock-index" >> would not be the same thing (even though with the current implementation >> it would behave the same). IOW, can we improve the documentation of >> "status" to point to make it easier to discover this use case. > > Yeah, the name is unfortunate. > > What the end user really wants to see, I suspect, is a "--read-only" > option that applies to any filesystem entity and to any command, in > the context of this thread, and also in the original discussion that > led to the introduction of that option. > > While I think the variable losing "index" from its name was a vast > improvement relative to "--no-lock-index", simply because it > expresses what we do a bit closer to "do not just do things without > modifying anything my repository", it did not go far enough. Yuck, the last sentence was garbled. What I meant as the ideal "read-only" was "do things without modifying anything in my repository". And to avoid any misunderstanding, what I mean by "it did not go far enough" is *NOT* this: We added a narrow feature and gave it a narrow name. Instead we should have added a "--read-only" feature, which this change may be a small part of, and waited releasing the whole thing until it is reasonably complete. By going far enough, I was wondering if we should have done something that we historically did not do. An "aspirational" feature that is incrementally released with a known bug and that will give users what they want in the larger picture when completed. IOW, we could have made this "git --read-only ", that is explained initially as "tell Git not to modify repository when it does not have to (e.g. avoid opportunistic update)" and perhaps later as "tell Git not to modify anything in the repository--if it absolutely has to (e.g. "git --read-only commit" is impossible to complete without modifying anything in the repository), error out instead". And with a known-bug section to clearly state that this feature is not something we vetted every codepath to ensure the read-only operation, but is still a work in progress. After all, "status" does not have to stay to be the only command with opportunistic modification (in the current implementation, it does "update-index --refresh" to update the index). And the index does not have to stay to be the only thing that is opportunistically modified (e.g. "git diff --cached" could not just opportunistically update the index, but also it could be taught to write out tree objects for intermediate directories when it does its cache-tree refresh, which would help the diff-index algorithm quite a bit in the performance department). Having a large picture option like "--read-only" instead of ending up with dozens of "we implemented a knob to tweak only this little piece, and here is an option to trigger it" would help users in the long run, but we traditionally did not do so because we tend to avoid shipping "incomplete" features, but being perfectionist with such a large undertaking can stall topics with feature bloat. In a case like this, however, I suspect that an aspirational feature that starts small, promises little and can be extended over time may be a good way to move forward.
Re: git status always modifies index?
Jeff Kingwrites: > What I was trying to get at is that naming it "status --no-lock-index" > would not be the same thing (even though with the current implementation > it would behave the same). IOW, can we improve the documentation of > "status" to point to make it easier to discover this use case. Yeah, the name is unfortunate. What the end user really wants to see, I suspect, is a "--read-only" option that applies to any filesystem entity and to any command, in the context of this thread, and also in the original discussion that led to the introduction of that option. While I think the variable losing "index" from its name was a vast improvement relative to "--no-lock-index", simply because it expresses what we do a bit closer to "do not just do things without modifying anything my repository", it did not go far enough.
Re: git status always modifies index?
Hi Peff, On Wed, 22 Nov 2017, Jeff King wrote: > On Wed, Nov 22, 2017 at 01:56:35PM -0800, Jonathan Nieder wrote: > > > Jeff King wrote: > > > On Wed, Nov 22, 2017 at 12:27:20PM -0800, Jonathan Nieder wrote: > > > > >> That said, I wonder if this use case is an illustration that a name > > >> like --no-lock-index (as was used in Git for Windows when this feature > > >> first appeared) or --no-refresh-on-disk-index (sorry, I am terrible at > > >> coming up with option names) would make the feature easier to > > >> discover. > > [...] > > > Or maybe just living with the minor philosophical rough edges, > > > since it seems OK in practice. > > > > To be clear, my concern is not philosophical but practical: I'm saying > > if it's a "git status" option (or at least shows up in the "git > > status" manpage) and it is memorably about $GIT_DIR/index (at least > > mentions that in its description) then it is more likely to help > > people. > > Right, I went a little off track of your original point. > > What I was trying to get at is that naming it "status --no-lock-index" > would not be the same thing (even though with the current implementation > it would behave the same). IOW, can we improve the documentation of > "status" to point to make it easier to discover this use case. I had the hunch that renaming the option (and moving it away from `git status`, even if it is currently only affecting `git status` and even if it will most likely be desirable to have the option to really only prevent `git status` from writing .lock files) was an unfortunate decision (and made my life as Git for Windows quite a bit harder than really necessary, it cost me over one workday of a bug hunt, mainly due to a false flag indicating `git rebase` to be the culprit). And I hinted at it, too. Maybe I should trust my instincts and act on them more. It is not like it was the first time that I had doubts that turned out to have merit, see e.g. the regression introduced into the formerly rock-solid set_hidden_flag() patches due to changes I made reluctantly during upstreaming, or the regression introduced during v1->v2 in my regex-buf patches that caused problems with mulibc and AIX. I really never understood why --no-optional-locks had to be introduced when it did exactly the same as --no-lock-index, and when the latter has a right to exist in the first place, even in the purely hypothetical case that we teach --no-optional-locks to handle more cases than just `git status`' writing of the index (and in essence, it looks like premature optimization): it is a very concrete use case that a user may want `git status` to refrain from even trying to write any file, as this thread shows very eloquently. Maybe it is time to reintroduce --no-lock-index, and make --no-optional-locks' functionality a true superset of --no-lock-index'. At least that is what my gut feeling tells me should be done. Ciao, Dscho
Re: git status always modifies index?
On Wed, Nov 22, 2017 at 01:56:35PM -0800, Jonathan Nieder wrote: > Jeff King wrote: > > On Wed, Nov 22, 2017 at 12:27:20PM -0800, Jonathan Nieder wrote: > > >> That said, I wonder if this use case is an illustration that a name > >> like --no-lock-index (as was used in Git for Windows when this feature > >> first appeared) or --no-refresh-on-disk-index (sorry, I am terrible at > >> coming up with option names) would make the feature easier to > >> discover. > [...] > > Or maybe just living with the minor philosophical rough edges, > > since it seems OK in practice. > > To be clear, my concern is not philosophical but practical: I'm saying > if it's a "git status" option (or at least shows up in the "git > status" manpage) and it is memorably about $GIT_DIR/index (at least > mentions that in its description) then it is more likely to help > people. Right, I went a little off track of your original point. What I was trying to get at is that naming it "status --no-lock-index" would not be the same thing (even though with the current implementation it would behave the same). IOW, can we improve the documentation of "status" to point to make it easier to discover this use case. -Peff
Re: git status always modifies index?
Jeff King wrote: > On Wed, Nov 22, 2017 at 12:27:20PM -0800, Jonathan Nieder wrote: >> That said, I wonder if this use case is an illustration that a name >> like --no-lock-index (as was used in Git for Windows when this feature >> first appeared) or --no-refresh-on-disk-index (sorry, I am terrible at >> coming up with option names) would make the feature easier to >> discover. [...] > Or maybe just living with the minor philosophical rough edges, > since it seems OK in practice. To be clear, my concern is not philosophical but practical: I'm saying if it's a "git status" option (or at least shows up in the "git status" manpage) and it is memorably about $GIT_DIR/index (at least mentions that in its description) then it is more likely to help people. Thanks, Jonathan
Re: git status always modifies index?
On Wed, Nov 22, 2017 at 12:27:20PM -0800, Jonathan Nieder wrote: > Nathan Neulinger wrote[1]: > > > I just got an answer to my stackoverflow question on this, > > apparently it's already implemented: > > > > https://stackoverflow.com/questions/47436939/how-to-run-git-status-without-modifying-git-index-such-as-in-a-prompt-command > > > > There is a "--no-optional-locks" command in 2.15 that looks like it > > does exactly what I need. > > I was about to point to > https://public-inbox.org/git/20170921043214.pyhdsrpy4omy5...@sigill.intra.peff.net/ > about exactly this thing. :) > > That said, I wonder if this use case is an illustration that a name > like --no-lock-index (as was used in Git for Windows when this feature > first appeared) or --no-refresh-on-disk-index (sorry, I am terrible at > coming up with option names) would make the feature easier to > discover. Yeah, it's interesting that Nathan does not care about the simultaneous locking here, but rather about the effect of writing to the repo for what would otherwise be a read-only operation. Under the original intent of --no-optional-locks I think if we could somehow magically update the on-disk index without lock contention, it would be OK to do so. But that would make it no longer work for this particular case. And I would also not be surprised if there are other cases where we write in a lockless way that would best be avoided in a multi-user setup. I'm thinking specifically of the way that some merge-y operations may write out intermediate objects, even though they're only needed inside the process. It _should_ be a read-only operation to ask "can these two things be merged cleanly", and you should be able to ask that without accidentally creating root-owned files in .git/objects. So I actually think what Nathan wants is not exactly the same as --no-optional-locks in the first place. But in practice, for a limited set of operations and with the way that locks work in Git, it accomplishes the same thing. Maybe that points to having a broader option. Or maybe having two separate options that largely have the same effect. Or maybe just living with the minor philosophical rough edges, since it seems OK in practice. -Peff
Re: git status always modifies index?
Hi, Nathan Neulinger wrote[1]: > I just got an answer to my stackoverflow question on this, > apparently it's already implemented: > > https://stackoverflow.com/questions/47436939/how-to-run-git-status-without-modifying-git-index-such-as-in-a-prompt-command > > There is a "--no-optional-locks" command in 2.15 that looks like it > does exactly what I need. I was about to point to https://public-inbox.org/git/20170921043214.pyhdsrpy4omy5...@sigill.intra.peff.net/ about exactly this thing. :) That said, I wonder if this use case is an illustration that a name like --no-lock-index (as was used in Git for Windows when this feature first appeared) or --no-refresh-on-disk-index (sorry, I am terrible at coming up with option names) would make the feature easier to discover. Thanks, Jonathan [1] https://public-inbox.org/git/dfbf4af3-e87c-bdcb-7544-685572925...@neulinger.org/
Re: git status always modifies index?
Ah, my bad. I missed this patch... Good luck! -Santiago. signature.asc Description: PGP signature
Re: git status always modifies index?
I just got an answer to my stackoverflow question on this, apparently it's already implemented: https://stackoverflow.com/questions/47436939/how-to-run-git-status-without-modifying-git-index-such-as-in-a-prompt-command There is a "--no-optional-locks" command in 2.15 that looks like it does exactly what I need. -- Nathan On 11/22/17 10:10 AM, Santiago Torres wrote: On Wed, Nov 22, 2017 at 09:37:09AM -0600, Nathan Neulinger wrote: What I'm meaning is - why does it need to write the index back out to disk? From looking at the code in builtin/commit.c it looks like it takes a lock on the index, collects the status, and then unconditionally rewrites the index file. Hmm, I just took a look at those lines and I see what you mean. From what I understand, this is to cache the result of the index computation for ensuing git calls. I'm proposing that the update_index_if_able call not actually be issued if it would result in a ownership change on the underlying file - such as a simple case of root user or other privileged account issuing 'git status' in a directory. I don't think this would be a desire-able default behavior. I'd wager that running git status using different accounts is not a great idea --- specially interacting with a user repository as root. I understand completely that it would be expected to be altered if the privileged user did a commit/add or any other operation that was inherently a 'write' operation, but doesn't seem like status should be one of those cases. I think it's because of the reasons above. That being said, I don't know what the rest of the community would think of something akin to a --no-update-index type flag. Cheers! -Santiago. -- Nathan Neulinger nn...@neulinger.org Neulinger Consulting (573) 612-1412
Re: git status always modifies index?
On Wed, Nov 22, 2017 at 09:37:09AM -0600, Nathan Neulinger wrote: > What I'm meaning is - why does it need to write the index back out to disk? > > From looking at the code in builtin/commit.c it looks like it takes a lock > on the index, collects the status, and then unconditionally rewrites the > index file. > Hmm, I just took a look at those lines and I see what you mean. From what I understand, this is to cache the result of the index computation for ensuing git calls. > I'm proposing that the update_index_if_able call not actually be issued if > it would result in a ownership change on the underlying file - such as a > simple case of root user or other privileged account issuing 'git status' in > a directory. I don't think this would be a desire-able default behavior. I'd wager that running git status using different accounts is not a great idea --- specially interacting with a user repository as root. > I understand completely that it would be expected to be altered if the > privileged user did a commit/add or any other operation that was inherently > a 'write' operation, but doesn't seem like status should be one of those > cases. I think it's because of the reasons above. That being said, I don't know what the rest of the community would think of something akin to a --no-update-index type flag. Cheers! -Santiago. signature.asc Description: PGP signature
Re: git status always modifies index?
What I'm meaning is - why does it need to write the index back out to disk? From looking at the code in builtin/commit.c it looks like it takes a lock on the index, collects the status, and then unconditionally rewrites the index file. I'm proposing that the update_index_if_able call not actually be issued if it would result in a ownership change on the underlying file - such as a simple case of root user or other privileged account issuing 'git status' in a directory. I understand completely that it would be expected to be altered if the privileged user did a commit/add or any other operation that was inherently a 'write' operation, but doesn't seem like status should be one of those cases. -- Nathan On 11/22/17 9:30 AM, Santiago Torres wrote: Hi Nathan. Do you mean git-status writing an index file? What would you suggest for git-status to compute which files have changed without modifying an index-file? Or are you suggesting git-status to fail if the index file doesn't belong to the user-id who invoked the command... Thanks, -Santiago -- Nathan Neulinger nn...@neulinger.org Neulinger Consulting (573) 612-1412
Re: git status always modifies index?
Hi Nathan. Do you mean git-status writing an index file? What would you suggest for git-status to compute which files have changed without modifying an index-file? Or are you suggesting git-status to fail if the index file doesn't belong to the user-id who invoked the command... Thanks, -Santiago signature.asc Description: PGP signature