Re: [RFC PATCH] Introduce "precious" file concept

2018-12-06 Thread Duy Nguyen
On Tue, Nov 27, 2018 at 1:56 PM Jacob Keller  wrote:
> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
>
> Either Duy's solution with a sort of "untracked" reflog, or the
> garbage/trashable notion.

The "untracked reflog" is partially functional now [1] if you want to
have a look. I'm not going to post the series until post-2.20, but if
you do look, I suggest the first patch that lays out the design and a
plumbing command to manage it. Basically you'll do

git backup-log --id=worktree log 

then pick up the version you like with "git backup-log cat" and do
whatever you want with it. High level UI is not there and will be a
topic of discussion.

The precious/trashable/garbage notion can be used to suppress making backups.

[1] https://gitlab.com/pclouds/git/commits/backup-log
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-30 Thread Duy Nguyen
On Wed, Nov 28, 2018 at 10:54 PM Ævar Arnfjörð Bjarmason
 wrote:
> But we must have some viable way to repair warts in the tools, and
> losing user data is a *big* wart.
>
> I don't think something like the endgame you've described in
> https://public-inbox.org/git/xmqqzhtwuhpc@gitster-ct.c.googlers.com/
> is ever going to work. Novice git users (the vast majority) are not
> going to diligently update both .gitignore and some .gitattribute
> mechanism in lockstep. I'd bet most git users haven't read more than a
> few paragraphs of our entire documentation at best.
>
> So what's the way forward? I think ultimately we must move to something
> where we effectively version the entire CLI UI similar to stable API
> versions. I.e. for things like this that would break some things (or
> Duy's new "split checkout") introduce them as flags first, then bundle
> up all such flags and cut a major release "Git 3, 4, ...", and
> eventually remove old functionality.

Related to Duy's new "split chekckout", I just realized that I added
--overwrite-ignore (enabled by default) [1] years ago to allow to out
out of this behavior. We could turn --no-overwrite-ignore by default
on the new command "git switch-branch" to err on the safe side. Then
the user could switch to --overwrite-ignore once they learn more about
gitignore and gitattributes (or the coming "backup log"). I'm not sure
if I really like this, but at least it's one of the options.

[1] 
https://public-inbox.org/git/1322388933-6284-2-git-send-email-pclo...@gmail.com/
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-28 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> I don't think something like the endgame you've described in
> https://public-inbox.org/git/xmqqzhtwuhpc@gitster-ct.c.googlers.com/
> is ever going to work. Novice git users (the vast majority) are not
> going to diligently update both .gitignore and some .gitattribute
> mechanism in lockstep.

That goes both ways, no?  Forcing people to add the same pattern,
e.g. *.o, to .gitignore and .gitattribute to say they are
expendable, when most of the time they are, is not something you
should expect from the normal population.

>> I would think that a proper automation needs per-path hint from the
>> user and/or the project, not just a single-size-fits-all --force
>> option, and "unlike all the *.o ignored files that are expendable,
>> this vendor-supplied-object.o is not" is one way to give such a
>> per-path hint.
>>
>>> This would give scripts which relied on our stable plumbing consistent
>>> behavior, while helping users who're using our main porcelain not to
>>> lose data. I could then add a --force option to the likes of read-tree
>>> (on by default), so you could get porcelain-like behavior with
>>> --no-force.
>>
>> At that low level, I suspect that a single size fits all "--force"
>> would work even less well.
>
> Yeah I don't think the one-size-fits-all way out of this is a single
> --force flag.

Yes, indeed.  That's why I prefer the "precious" bit.  The system
would behave the same way with or without it, but projects (not
individual endusers) can take advantage of the feature if they
wanted to.



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-28 Thread Ævar Arnfjörð Bjarmason


On Wed, Nov 28 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>> What do you think about some patch like that which retains the plumbing
>> behavior for things like read-tree, doesn't introduce "precious" or
>> "trashable", and just makes you specify "[checkout|merge|...] --force"
>> in cases where we'd have clobbering?
>
> Whether you like it or not, don't people's automation use tons of
> invocations of "git merge", "git checkout", etc.?  You'd be breaking
> them by such a change.

I'm so sympathetic to this argument that I tried to convince you of
something like this around a year and a half ago:
https://public-inbox.org/git/cacbzzx59kxpoejiuktzln6zjo_xpiwve7xga6q-53j2lwvf...@mail.gmail.com/
:)

I was probing for what your current stance on this sort of thing is,
because discussions like this tend to get bogged down in the irrelevant
distraction of whether something is plumbing or porcelain, which almost
none of our users care about, and we've effectively stopped caring about
ourselves.

But we must have some viable way to repair warts in the tools, and
losing user data is a *big* wart.

I don't think something like the endgame you've described in
https://public-inbox.org/git/xmqqzhtwuhpc@gitster-ct.c.googlers.com/
is ever going to work. Novice git users (the vast majority) are not
going to diligently update both .gitignore and some .gitattribute
mechanism in lockstep. I'd bet most git users haven't read more than a
few paragraphs of our entire documentation at best.

So what's the way forward? I think ultimately we must move to something
where we effectively version the entire CLI UI similar to stable API
versions. I.e. for things like this that would break some things (or
Duy's new "split checkout") introduce them as flags first, then bundle
up all such flags and cut a major release "Git 3, 4, ...", and
eventually remove old functionality.

> Other than that, if we never had Git before and do not have to worry
> about existing users, I'd think it would be a lot closer to the ideal
> than today's system if "checkout  foo.o" rejected overwriting
> "foo.o" that is not tracked in the current index but matches an ignore
> pattern, and required a "--force" option to overwrite it.
>
> A user, during a conflict resolution, may say "I want this 'git
> checkout foo/' to ignore conflicted paths in that directory, so I
> would give "--force" option to it, but now "--force" also implies
> that I am willing to clobber ignored paths, which means I cannot use
> it".
>
> I would think that a proper automation needs per-path hint from the
> user and/or the project, not just a single-size-fits-all --force
> option, and "unlike all the *.o ignored files that are expendable,
> this vendor-supplied-object.o is not" is one way to give such a
> per-path hint.
>
>> This would give scripts which relied on our stable plumbing consistent
>> behavior, while helping users who're using our main porcelain not to
>> lose data. I could then add a --force option to the likes of read-tree
>> (on by default), so you could get porcelain-like behavior with
>> --no-force.
>
> At that low level, I suspect that a single size fits all "--force"
> would work even less well.

Yeah I don't think the one-size-fits-all way out of this is a single
--force flag.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Per Lundberg
On 11/28/18 3:21 AM, brian m. carlson wrote:

Thanks for the elaboration, Brian - good to get things down to a 
practical, real-world level.

 > [...]
 >
> I point this out to underscore how fundamental this change is.  People
> overwhelmingly do not read the release notes, so expecting people to
> realize that a change has been made, especially when many people only
> upgrade Git because of a security issue, may result in unexpected
> consequences.

This is one of the more important things of software engineering. _Don't 
mix security fixes with breaking changes_. They are very different 
things and like you say, we can't really expect people to real release 
notes for every little incremental release we do.

That's an important part of the SemVer guarantee: a minor version 
bump/patch level increase means "bug fix" or "added functionality in a 
backwards-compatible way". So: no changing of default behavior or 
semantics, but adding a new behavior which is opt-in is perfectly fine.

> Just because we don't think of this use of Git as normal or desirable > 
> doesn't mean people don't do it and don't expect it to keep working.

In other words, we need to be "bug-by-bug" compatible with previous 
versions. :-) What some people would consider a bug, others would 
consider a feature.

> I think any change we make here has to be opt-in, at least until Git
> 3.0.  A config knob would probably be the right way to go. 

Agree. It's less than optimal but I think it's something that we all 
could live with. Deciding to switching the default (or not) is then 
rightfully postponed to a later time, and we can revisit the pros and 
cons then. The important thing now is to get the functionality 
implemented in a good way, tested and eventually merged.
--
Per Lundberg


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> What do you think about some patch like that which retains the plumbing
> behavior for things like read-tree, doesn't introduce "precious" or
> "trashable", and just makes you specify "[checkout|merge|...] --force"
> in cases where we'd have clobbering?

Whether you like it or not, don't people's automation use tons of
invocations of "git merge", "git checkout", etc.?  You'd be breaking
them by such a change.  Other than that, if we never had Git before
and do not have to worry about existing users, I'd think it would be
a lot closer to the ideal than today's system if "checkout 
foo.o" rejected overwriting "foo.o" that is not tracked in the
current index but matches an ignore pattern, and required a
"--force" option to overwrite it.

A user, during a conflict resolution, may say "I want this 'git
checkout foo/' to ignore conflicted paths in that directory, so I
would give "--force" option to it, but now "--force" also implies
that I am willing to clobber ignored paths, which means I cannot use
it".

I would think that a proper automation needs per-path hint from the
user and/or the project, not just a single-size-fits-all --force
option, and "unlike all the *.o ignored files that are expendable,
this vendor-supplied-object.o is not" is one way to give such a
per-path hint.

> This would give scripts which relied on our stable plumbing consistent
> behavior, while helping users who're using our main porcelain not to
> lose data. I could then add a --force option to the likes of read-tree
> (on by default), so you could get porcelain-like behavior with
> --no-force.

At that low level, I suspect that a single size fits all "--force"
would work even less well.



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread brian m. carlson
On Tue, Nov 27, 2018 at 02:50:34PM +, Per Lundberg wrote:
> I agree strongly with this personally; if we must choose between "might
> break automation" and "might delete non-garbage files", I would say the
> former is the lesser evil of the two.
> 
> But, if I had 10 000 000 servers set up using automated scripts that
> would break because of this, I might think differently. Quite likely so,
> in fact.
> 
> What are these automation scenarios _more specifically_? Junio or Brian,
> would you care to elaborate? Is it for build servers where you want "git
> clean -dfx" to always reset the working copy to a pristine state or are
> we talking about some other scenarios?

We had long-running CI servers, since bootstrapping a new system took an
hour.  These would check out the branch to test and run some process
(essentially, a "make" and "make test").  Then, another branch would be
tested, and so on.  The old branch would likely not be merged at this
point.

The scenario I'm thinking of is when a file (say a CSS file) became
built instead of stored in the repository.  Then the file would be added
to .gitignore in the new commit, and it would be generated as part of
the make step.  It would be important to blow away that file away when
checking out a new commit, because not doing so would mean that the CI
system would simply fail to work and require manual intervention.

Moreover, a CI job might fail, due to either a flaky test or a
legitimate failures, so the job might need to be re-run multiple times.
Requiring human intervention, especially when such jobs might be running
at odd hours, would be undesirable.

Another thing we did was to use a specially named gitignore file in our
build step.  We created a new repository, copied the special gitignore
file in as .gitignore, copied in the source and build products, ran git
add and git commit, and then ran git clean -dfx to remove proprietary
source code, packaging the result.  A change to the behavior of git
clean -dfx would be devastating here.

I point this out to underscore how fundamental this change is.  People
overwhelmingly do not read the release notes, so expecting people to
realize that a change has been made, especially when many people only
upgrade Git because of a security issue, may result in unexpected
consequences.  Just because we don't think of this use of Git as normal
or desirable doesn't mean people don't do it and don't expect it to
keep working.  People do read and rely on our documentation.

I think any change we make here has to be opt-in, at least until Git
3.0.  A config knob would probably be the right way to go.  I realize
that may not provide all the benefits we want out of the box, but it
lets people turn the option on once and forget about it.  It also lets
people who don't desire this new behavior explicitly turn it off.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Duy Nguyen
On Tue, Nov 27, 2018 at 1:56 PM Jacob Keller  wrote:
>
> On Tue, Nov 27, 2018 at 1:45 AM Per Lundberg  wrote:
> >
> > On 11/26/18 5:55 PM, Duy Nguyen wrote:
> > > On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
> > >  wrote:
> > >> Some of the solutions overlap with this thing you want, but I think it's
> > >> worth keeping the distinction between the two in mind.
> > >
> > > On the other hand all use cases should be considered. It's going to be
> > > a mess to have "trashable" attribute that applies to some commands
> > > while "precious" to some others (and even worse when they overlap,
> > > imagine having to define both in .gitattributes)
> >
> > Agree - I think it would be a very bad idea to have a "mix" of both
> > trashable and precious. IMO, we should try to find which one of these
> > concepts suits most general use cases best and causes less churn for
> > existing scripts/users' existing "semantic expectations", and pick that one.
> > --
> > Per Lundberg
>
> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
>
> Either Duy's solution with a sort of "untracked" reflog, or the
> garbage/trashable notion.
>
> I don't like the idea of precious because it means people have to know
> and remember to opt in, and it's quite possible they will not do so
> until after they've lost real data.
>
> I'd only have trashable apply in the case where it was implicit. i.e.
> git clean -fdx would still delete them, as this is an explicit
> operation that (hopefully?) users know will delete data.

Yes I know it will delete ignored files. But I don't want it to delete
some files. There is no way I can tell Git to do that.

It's the same with merge/checkout's overwriting problem. Once the
initial surprise is over, I want control over what files I want Git to
just delete and not annoy me, what Git should not delete.
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 26 2018, Junio C Hamano wrote:

> Per Lundberg  writes:
>
>> How about something like this:
>> ...
>> Would this be a reasonable compromise for everybody?
>
> I do not think you'd need to introduce such a deliberately breaking
> change at all.  Just introduce a new "precious" class, perhaps mark
> them with the atttribute mechanism, and that would be the endgame.
> Early adopters would start marking ignored but not expendable paths
> with the "precious" attribute and they won't be clobbered.  As the
> mechanism becomes widely known and mature, more and more people use
> it.  And even after that happens, early adopters do not have to change
> any attribute setting, and late adopters would have plenty of examples
> to imitate.  Those who do not need any "precious" class do not have
> to do anything and you won't break any existing automation that way.

The patch I submitted in <87zhuf3gs0@evledraar.gmail.com>[1] changed
the behavior for read-tree & checkout & merge etc.

It was an RFC more in the spirit of showing what in our current tests
had to change to spur some discussion.

But I'm very sympathetic to this line of argument. I.e. in my patch I'm
changing the semantics of read-tree, which is plumbing.

What do you think about some patch like that which retains the plumbing
behavior for things like read-tree, doesn't introduce "precious" or
"trashable", and just makes you specify "[checkout|merge|...] --force"
in cases where we'd have clobbering?

This would give scripts which relied on our stable plumbing consistent
behavior, while helping users who're using our main porcelain not to
lose data. I could then add a --force option to the likes of read-tree
(on by default), so you could get porcelain-like behavior with
--no-force.

1. https://public-inbox.org/git/87zhuf3gs0@evledraar.gmail.com/


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Per Lundberg
On 11/27/18 2:55 PM, Jacob Keller wrote:

> Personally, I would rather err on the side which requires the least
> interaction from users to avoid silently clobbering an ignored file.
> 
 > [...]
> 
> I don't like the idea of precious because it means people have to know
> and remember to opt in, and it's quite possible they will not do so
> until after they've lost real data.

I agree strongly with this personally; if we must choose between "might 
break automation" and "might delete non-garbage files", I would say the 
former is the lesser evil of the two.

But, if I had 10 000 000 servers set up using automated scripts that 
would break because of this, I might think differently. Quite likely so, 
in fact.

What are these automation scenarios _more specifically_? Junio or Brian, 
would you care to elaborate? Is it for build servers where you want "git 
clean -dfx" to always reset the working copy to a pristine state or are 
we talking about some other scenarios?

> I'd only have trashable apply in the case where it was implicit. i.e.
> git clean -fdx would still delete them, as this is an explicit
> operation that (hopefully?) users know will delete data.

This is one of the tougher calls, unfortunately.

If I was a user (which I am), and I was typing "git clean -dfx", what 
would I expect?

The help text (currently) states "-x   remove ignored files, too".

Would it be safe to assume that people would understand that "ignored 
_does not_ mean trashable when doing "git checkout some-ref" BUT it 
_does_ mean trashable in the "git clean -dfx" context"? I'm not so 
certain. It would be one of those perceived inconsistencies that would 
make people scream in anger because they _presumed_ that with the new 
"trashable" concept, "git clean -dfx" would no longer hit them in the leg.

And the other way around: if we change "git clean -dfx" to _not_ treat 
"ignored == trashable", it is likely to "hose automation" as it has been 
previously stated. People who might be using this syntax and _want_ it 
to remove ignored files would be upset, and rightfully so.

So in my POV, it's a tough decision between two, less-than-optimal 
alternatives.

But I would perhaps be able to live with the current semantics for "git 
clean -dfx" _as long as we update the help text_ so that "-x" indicates 
more clearly that non-trashable files can be deleted. It doesn't make 
things _worse_ than they currently are and if this is what it takes to 
get the trashable concept implemented and accepted by the community, 
it's a compromise I'd be willing to make.
--
Per Lundberg



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Jacob Keller
On Tue, Nov 27, 2018 at 1:45 AM Per Lundberg  wrote:
>
> On 11/26/18 5:55 PM, Duy Nguyen wrote:
> > On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
> >  wrote:
> >> Some of the solutions overlap with this thing you want, but I think it's
> >> worth keeping the distinction between the two in mind.
> >
> > On the other hand all use cases should be considered. It's going to be
> > a mess to have "trashable" attribute that applies to some commands
> > while "precious" to some others (and even worse when they overlap,
> > imagine having to define both in .gitattributes)
>
> Agree - I think it would be a very bad idea to have a "mix" of both
> trashable and precious. IMO, we should try to find which one of these
> concepts suits most general use cases best and causes less churn for
> existing scripts/users' existing "semantic expectations", and pick that one.
> --
> Per Lundberg

Personally, I would rather err on the side which requires the least
interaction from users to avoid silently clobbering an ignored file.

Either Duy's solution with a sort of "untracked" reflog, or the
garbage/trashable notion.

I don't like the idea of precious because it means people have to know
and remember to opt in, and it's quite possible they will not do so
until after they've lost real data.

I'd only have trashable apply in the case where it was implicit. i.e.
git clean -fdx would still delete them, as this is an explicit
operation that (hopefully?) users know will delete data.

Thanks,
Jake


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-27 Thread Per Lundberg
On 11/26/18 5:55 PM, Duy Nguyen wrote:
> On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
>  wrote:
>> Some of the solutions overlap with this thing you want, but I think it's
>> worth keeping the distinction between the two in mind.
> 
> On the other hand all use cases should be considered. It's going to be
> a mess to have "trashable" attribute that applies to some commands
> while "precious" to some others (and even worse when they overlap,
> imagine having to define both in .gitattributes)

Agree - I think it would be a very bad idea to have a "mix" of both 
trashable and precious. IMO, we should try to find which one of these 
concepts suits most general use cases best and causes less churn for 
existing scripts/users' existing "semantic expectations", and pick that one.
--
Per Lundberg


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Eckhard Maaß
On Mon, Nov 12, 2018 at 11:22:09PM +, brian m. carlson wrote:
> This is going to totally hose automation.  My last job had files which
> might move from tracked to untracked (a file that had become generated),
> and long-running CI and build systems would need to be able to check out
> one status and switch to the other.  Your proposed change will prevent
> those systems from working, whereas they previously did.

Wouldn't those systems not use -f right now? And shouldn't Git have the
same semantic for -f to clobber everything in the proposed use case?
Like it does right now for untracked files which are not ignored. So to
be save going back and forth I would expect those systems to use -f
anyway. Have I missed something here?

Regards,
Eckhard


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Duy Nguyen
On Mon, Nov 26, 2018 at 4:47 PM Ævar Arnfjörð Bjarmason
 wrote:
> >> >> How about something like this:
> >> >>
> >> >> 1. Introduce a concept with "garbage" files, which git is "permitted to
> >> >> delete" without prompting.
> >> >>
> >> >> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> >> >> making the new behavior _opt in_ to avoid breaking automated
> >> >> systems/existing scripts for anyone. Put the setting for this behind a
> >> >> new core.* config flag.
> >> >>
> >> >> 3. In the plan for version 3.0 (a new major version where some breakage
> >> >> can be tolerable, according to Semantic Versioning), change the default
> >> >> so that "only explicit garbage is garbage". Include very clear notices
> >> >> of this in the release notes. The config flag is retained, but its
> >> >> default changes from true->false or vice versa. People who dislike the
> >> >> new behavior can easily change back to the 2.x semantics.
> >> >
> >> > How does this garbage thing interact with "git clean -x"? My
> >> > interpretation of this flag/attribute is that at version 3.0 by
> >> > default all ignored files are _not_ garbage, so "git clean -x" should
> >> > not remove any of them. Which is weird because most of ignored files
> >> > are like *.o that should be removed.
> >> >
> >> > I also need to mark "precious" on untracked or even tracked files (*).
> >> > Not sure how this "garbage" attribute interacts with that.
> >> >
> >> > (*) I was hoping I could get the idea [1] implemented in somewhat good
> >> > shape before presenting here. But I'm a bit slow on that front. So
> >> > yeah this "precious" on untracked/tracked thingy may be even
> >> > irrelevant if the patch series will be rejected.
> >>
> >> I think a garbage (or trashable) flag, if implemented, wouldn't need any
> >> special case in git-clean, i.e. -x would remove all untracked files,
> >> whether ignored or garbage/trashable. That's what my patch to implement
> >> it does:
> >> https://public-inbox.org/git/87zhuf3gs0@evledraar.gmail.com/
> >>
> >> I think that makes sense. Users running "git clean" have "--dry-run" and
> >> unlike "checkout a branch" or "merge this commit" where we'll now shred
> >> data implicitly it's obvious that git-clean is going to shred your data.
> >
> > Then that't not what I want. If I'm going to mark to keep "config.mak"
> > around, I'm not going to carefully move it away before doing "git
> > clean -fdx" then move it back. No "git clean --dry-run" telling me to
> > make a backup of config.mak is no good.
>
> Understood. I mean this in the context of solving the problem users have
> with running otherwise non-data-destroying commands like "checkout" and
> "merge" and getting their data destroyed, which is overwhelmingly why
> this topic gets resurrected.
>
> Some of the solutions overlap with this thing you want, but I think it's
> worth keeping the distinction between the two in mind.

On the other hand all use cases should be considered. It's going to be
a mess to have "trashable" attribute that applies to some commands
while "precious" to some others (and even worse when they overlap,
imagine having to define both in .gitattributes)

> I.e. I can
> imagine us finding some acceptable solution to the data shredding
> problem that doesn't implement this mode for "git-clean", or the other
> way around.
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 26 2018, Duy Nguyen wrote:

> On Mon, Nov 26, 2018 at 4:34 PM Ævar Arnfjörð Bjarmason
>  wrote:
>>
>>
>> On Mon, Nov 26 2018, Duy Nguyen wrote:
>>
>> > On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg  
>> > wrote:
>> >>
>> >> On 11/13/18 1:22 AM, brian m. carlson wrote:
>> >> > This is going to totally hose automation.  My last job had files which
>> >> > might move from tracked to untracked (a file that had become generated),
>> >> > and long-running CI and build systems would need to be able to check out
>> >> > one status and switch to the other.  Your proposed change will prevent
>> >> > those systems from working, whereas they previously did.
>> >> >
>> >> > I agree that your proposal would have been a better design originally,
>> >> > but breaking the way automated systems currently work is probably going
>> >> > to be a dealbreaker.
>> >>
>> >> How about something like this:
>> >>
>> >> 1. Introduce a concept with "garbage" files, which git is "permitted to
>> >> delete" without prompting.
>> >>
>> >> 2. Retain the current default, i.e. "ignored files are garbage" for now,
>> >> making the new behavior _opt in_ to avoid breaking automated
>> >> systems/existing scripts for anyone. Put the setting for this behind a
>> >> new core.* config flag.
>> >>
>> >> 3. In the plan for version 3.0 (a new major version where some breakage
>> >> can be tolerable, according to Semantic Versioning), change the default
>> >> so that "only explicit garbage is garbage". Include very clear notices
>> >> of this in the release notes. The config flag is retained, but its
>> >> default changes from true->false or vice versa. People who dislike the
>> >> new behavior can easily change back to the 2.x semantics.
>> >
>> > How does this garbage thing interact with "git clean -x"? My
>> > interpretation of this flag/attribute is that at version 3.0 by
>> > default all ignored files are _not_ garbage, so "git clean -x" should
>> > not remove any of them. Which is weird because most of ignored files
>> > are like *.o that should be removed.
>> >
>> > I also need to mark "precious" on untracked or even tracked files (*).
>> > Not sure how this "garbage" attribute interacts with that.
>> >
>> > (*) I was hoping I could get the idea [1] implemented in somewhat good
>> > shape before presenting here. But I'm a bit slow on that front. So
>> > yeah this "precious" on untracked/tracked thingy may be even
>> > irrelevant if the patch series will be rejected.
>>
>> I think a garbage (or trashable) flag, if implemented, wouldn't need any
>> special case in git-clean, i.e. -x would remove all untracked files,
>> whether ignored or garbage/trashable. That's what my patch to implement
>> it does:
>> https://public-inbox.org/git/87zhuf3gs0@evledraar.gmail.com/
>>
>> I think that makes sense. Users running "git clean" have "--dry-run" and
>> unlike "checkout a branch" or "merge this commit" where we'll now shred
>> data implicitly it's obvious that git-clean is going to shred your data.
>
> Then that't not what I want. If I'm going to mark to keep "config.mak"
> around, I'm not going to carefully move it away before doing "git
> clean -fdx" then move it back. No "git clean --dry-run" telling me to
> make a backup of config.mak is no good.

Understood. I mean this in the context of solving the problem users have
with running otherwise non-data-destroying commands like "checkout" and
"merge" and getting their data destroyed, which is overwhelmingly why
this topic gets resurrected.

Some of the solutions overlap with this thing you want, but I think it's
worth keeping the distinction between the two in mind. I.e. I can
imagine us finding some acceptable solution to the data shredding
problem that doesn't implement this mode for "git-clean", or the other
way around.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Duy Nguyen
On Mon, Nov 26, 2018 at 4:34 PM Ævar Arnfjörð Bjarmason
 wrote:
>
>
> On Mon, Nov 26 2018, Duy Nguyen wrote:
>
> > On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg  wrote:
> >>
> >> On 11/13/18 1:22 AM, brian m. carlson wrote:
> >> > This is going to totally hose automation.  My last job had files which
> >> > might move from tracked to untracked (a file that had become generated),
> >> > and long-running CI and build systems would need to be able to check out
> >> > one status and switch to the other.  Your proposed change will prevent
> >> > those systems from working, whereas they previously did.
> >> >
> >> > I agree that your proposal would have been a better design originally,
> >> > but breaking the way automated systems currently work is probably going
> >> > to be a dealbreaker.
> >>
> >> How about something like this:
> >>
> >> 1. Introduce a concept with "garbage" files, which git is "permitted to
> >> delete" without prompting.
> >>
> >> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> >> making the new behavior _opt in_ to avoid breaking automated
> >> systems/existing scripts for anyone. Put the setting for this behind a
> >> new core.* config flag.
> >>
> >> 3. In the plan for version 3.0 (a new major version where some breakage
> >> can be tolerable, according to Semantic Versioning), change the default
> >> so that "only explicit garbage is garbage". Include very clear notices
> >> of this in the release notes. The config flag is retained, but its
> >> default changes from true->false or vice versa. People who dislike the
> >> new behavior can easily change back to the 2.x semantics.
> >
> > How does this garbage thing interact with "git clean -x"? My
> > interpretation of this flag/attribute is that at version 3.0 by
> > default all ignored files are _not_ garbage, so "git clean -x" should
> > not remove any of them. Which is weird because most of ignored files
> > are like *.o that should be removed.
> >
> > I also need to mark "precious" on untracked or even tracked files (*).
> > Not sure how this "garbage" attribute interacts with that.
> >
> > (*) I was hoping I could get the idea [1] implemented in somewhat good
> > shape before presenting here. But I'm a bit slow on that front. So
> > yeah this "precious" on untracked/tracked thingy may be even
> > irrelevant if the patch series will be rejected.
>
> I think a garbage (or trashable) flag, if implemented, wouldn't need any
> special case in git-clean, i.e. -x would remove all untracked files,
> whether ignored or garbage/trashable. That's what my patch to implement
> it does:
> https://public-inbox.org/git/87zhuf3gs0@evledraar.gmail.com/
>
> I think that makes sense. Users running "git clean" have "--dry-run" and
> unlike "checkout a branch" or "merge this commit" where we'll now shred
> data implicitly it's obvious that git-clean is going to shred your data.

Then that't not what I want. If I'm going to mark to keep "config.mak"
around, I'm not going to carefully move it away before doing "git
clean -fdx" then move it back. No "git clean --dry-run" telling me to
make a backup of config.mak is no good.
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 26 2018, Duy Nguyen wrote:

> On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg  wrote:
>>
>> On 11/13/18 1:22 AM, brian m. carlson wrote:
>> > This is going to totally hose automation.  My last job had files which
>> > might move from tracked to untracked (a file that had become generated),
>> > and long-running CI and build systems would need to be able to check out
>> > one status and switch to the other.  Your proposed change will prevent
>> > those systems from working, whereas they previously did.
>> >
>> > I agree that your proposal would have been a better design originally,
>> > but breaking the way automated systems currently work is probably going
>> > to be a dealbreaker.
>>
>> How about something like this:
>>
>> 1. Introduce a concept with "garbage" files, which git is "permitted to
>> delete" without prompting.
>>
>> 2. Retain the current default, i.e. "ignored files are garbage" for now,
>> making the new behavior _opt in_ to avoid breaking automated
>> systems/existing scripts for anyone. Put the setting for this behind a
>> new core.* config flag.
>>
>> 3. In the plan for version 3.0 (a new major version where some breakage
>> can be tolerable, according to Semantic Versioning), change the default
>> so that "only explicit garbage is garbage". Include very clear notices
>> of this in the release notes. The config flag is retained, but its
>> default changes from true->false or vice versa. People who dislike the
>> new behavior can easily change back to the 2.x semantics.
>
> How does this garbage thing interact with "git clean -x"? My
> interpretation of this flag/attribute is that at version 3.0 by
> default all ignored files are _not_ garbage, so "git clean -x" should
> not remove any of them. Which is weird because most of ignored files
> are like *.o that should be removed.
>
> I also need to mark "precious" on untracked or even tracked files (*).
> Not sure how this "garbage" attribute interacts with that.
>
> (*) I was hoping I could get the idea [1] implemented in somewhat good
> shape before presenting here. But I'm a bit slow on that front. So
> yeah this "precious" on untracked/tracked thingy may be even
> irrelevant if the patch series will be rejected.

I think a garbage (or trashable) flag, if implemented, wouldn't need any
special case in git-clean, i.e. -x would remove all untracked files,
whether ignored or garbage/trashable. That's what my patch to implement
it does:
https://public-inbox.org/git/87zhuf3gs0@evledraar.gmail.com/

I think that makes sense. Users running "git clean" have "--dry-run" and
unlike "checkout a branch" or "merge this commit" where we'll now shred
data implicitly it's obvious that git-clean is going to shred your data.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Duy Nguyen
On Mon, Nov 26, 2018 at 10:30 AM Per Lundberg  wrote:
>
> On 11/13/18 1:22 AM, brian m. carlson wrote:
> > This is going to totally hose automation.  My last job had files which
> > might move from tracked to untracked (a file that had become generated),
> > and long-running CI and build systems would need to be able to check out
> > one status and switch to the other.  Your proposed change will prevent
> > those systems from working, whereas they previously did.
> >
> > I agree that your proposal would have been a better design originally,
> > but breaking the way automated systems currently work is probably going
> > to be a dealbreaker.
>
> How about something like this:
>
> 1. Introduce a concept with "garbage" files, which git is "permitted to
> delete" without prompting.
>
> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> making the new behavior _opt in_ to avoid breaking automated
> systems/existing scripts for anyone. Put the setting for this behind a
> new core.* config flag.
>
> 3. In the plan for version 3.0 (a new major version where some breakage
> can be tolerable, according to Semantic Versioning), change the default
> so that "only explicit garbage is garbage". Include very clear notices
> of this in the release notes. The config flag is retained, but its
> default changes from true->false or vice versa. People who dislike the
> new behavior can easily change back to the 2.x semantics.

How does this garbage thing interact with "git clean -x"? My
interpretation of this flag/attribute is that at version 3.0 by
default all ignored files are _not_ garbage, so "git clean -x" should
not remove any of them. Which is weird because most of ignored files
are like *.o that should be removed.

I also need to mark "precious" on untracked or even tracked files (*).
Not sure how this "garbage" attribute interacts with that.

(*) I was hoping I could get the idea [1] implemented in somewhat good
shape before presenting here. But I'm a bit slow on that front. So
yeah this "precious" on untracked/tracked thingy may be even
irrelevant if the patch series will be rejected.

[1] 
https://public-inbox.org/git/cacsjy8c3rofv0kqejrwufqqzbnfu4msxjtpheybgmmrroff...@mail.gmail.com/

> Would this be a reasonable compromise for everybody?
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Junio C Hamano
Per Lundberg  writes:

> How about something like this:
> ...
> Would this be a reasonable compromise for everybody?

I do not think you'd need to introduce such a deliberately breaking
change at all.  Just introduce a new "precious" class, perhaps mark
them with the atttribute mechanism, and that would be the endgame.
Early adopters would start marking ignored but not expendable paths
with the "precious" attribute and they won't be clobbered.  As the
mechanism becomes widely known and mature, more and more people use
it.  And even after that happens, early adopters do not have to change
any attribute setting, and late adopters would have plenty of examples
to imitate.  Those who do not need any "precious" class do not have
to do anything and you won't break any existing automation that way.





Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 26 2018, Per Lundberg wrote:

> On 11/13/18 1:22 AM, brian m. carlson wrote:
>> This is going to totally hose automation.  My last job had files which
>> might move from tracked to untracked (a file that had become generated),
>> and long-running CI and build systems would need to be able to check out
>> one status and switch to the other.  Your proposed change will prevent
>> those systems from working, whereas they previously did.
>>
>> I agree that your proposal would have been a better design originally,
>> but breaking the way automated systems currently work is probably going
>> to be a dealbreaker.
>
> How about something like this:
>
> 1. Introduce a concept with "garbage" files, which git is "permitted to
> delete" without prompting.
>
> 2. Retain the current default, i.e. "ignored files are garbage" for now,
> making the new behavior _opt in_ to avoid breaking automated
> systems/existing scripts for anyone. Put the setting for this behind a
> new core.* config flag.
>
> 3. In the plan for version 3.0 (a new major version where some breakage
> can be tolerable, according to Semantic Versioning), change the default
> so that "only explicit garbage is garbage". Include very clear notices
> of this in the release notes. The config flag is retained, but its
> default changes from true->false or vice versa. People who dislike the
> new behavior can easily change back to the 2.x semantics.
>
> Would this be a reasonable compromise for everybody?

Possibly, but I think there's an earlier step zero there for anyone
interested in pursuing this (and currently I can't make time for it),
which is to submit a patch with tests and documentation showing exactly
the sort of scenarios where we clobber or don't clobber existing files.

As my https://public-inbox.org/git/87zhuf3gs0@evledraar.gmail.com/
shows we have tests for this, but they're not explicit, and some want to
test some unrelated thing.

I.e. to test the cases where we clobber foo.c because foo.c now
explicitly exists, or cases where dir/foo.c is clobbered because "dir"
is now a tracked text file etc., are those the only two cases? I vaguely
suspect that there were other interesting cases, but at this point the
information has been paged out of the working set of my wetware. The
thread at
https://public-inbox.org/git/87o9au39s7@evledraar.gmail.com/ has
some notes about this.

Then as noted in
https://public-inbox.org/git/87wopj3661@evledraar.gmail.com/ the
reason we have this behavior seems to be something that grew organically
from a semi-related bugfix.

So I don't think we're at a point where we're all dug into our trenches
and some people want X and others want Y and in the name of backwards
compatibility we're going to stay with X. It may turn out that we just
want to retain 10% of X, and can get 99% of the safety of Y by doing
that.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-26 Thread Per Lundberg
On 11/13/18 1:22 AM, brian m. carlson wrote:
> This is going to totally hose automation.  My last job had files which
> might move from tracked to untracked (a file that had become generated),
> and long-running CI and build systems would need to be able to check out
> one status and switch to the other.  Your proposed change will prevent
> those systems from working, whereas they previously did.
> 
> I agree that your proposal would have been a better design originally,
> but breaking the way automated systems currently work is probably going
> to be a dealbreaker.

How about something like this:

1. Introduce a concept with "garbage" files, which git is "permitted to 
delete" without prompting.

2. Retain the current default, i.e. "ignored files are garbage" for now, 
making the new behavior _opt in_ to avoid breaking automated 
systems/existing scripts for anyone. Put the setting for this behind a 
new core.* config flag.

3. In the plan for version 3.0 (a new major version where some breakage 
can be tolerable, according to Semantic Versioning), change the default 
so that "only explicit garbage is garbage". Include very clear notices 
of this in the release notes. The config flag is retained, but its 
default changes from true->false or vice versa. People who dislike the 
new behavior can easily change back to the 2.x semantics.

Would this be a reasonable compromise for everybody?
-- 
Per Lundberg



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread brian m. carlson
On Sun, Nov 11, 2018 at 01:33:44PM +0100, Ævar Arnfjörð Bjarmason wrote:
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
> 
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
> 
> "Hey, I 100% understood .gitignore semantics including this one part
> of the docs where you say you'll do this, but just forgot one day
> and deleted my work. Can we get some more safety?"
> 
> But rather (with some hyperbole for effect):
> 
> "ZOMG git deleted my file! Is this a bug??"
> 
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
> 
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.

This is going to totally hose automation.  My last job had files which
might move from tracked to untracked (a file that had become generated),
and long-running CI and build systems would need to be able to check out
one status and switch to the other.  Your proposed change will prevent
those systems from working, whereas they previously did.

I agree that your proposal would have been a better design originally,
but breaking the way automated systems currently work is probably going
to be a dealbreaker.
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Duy Nguyen
On Sun, Nov 11, 2018 at 2:06 PM Ævar Arnfjörð Bjarmason
 wrote:
> +Trashable files
> +~~~
> +
> +`trashable`
> +^^
> +
> +Provides an escape hatch for re-enabling a potentially data destroying
> +feature which was enabled by default between Git versions 1.5.2 and
> +2.20. See the `NOTES` section of linkgit:gitignore[5] for details.

How does this interact with "git clean -x"? Most ignored files will
not have trashable attribute, so we don't remove any of them? Making
"git clean" completely ignore this attribute is also possible, I
guess, if we rename it somehow to avoid confusion.
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Duy Nguyen
On Mon, Nov 12, 2018 at 10:09 AM Matthieu Moy  wrote:
> May I remind an idea I sugested in an old thread: add an intermediate level
> where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
> backup files):
>
> https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/
>
> One advantage of the "rename" behavior is that it's safer that the current,
> but still not very disturbing for people who like the current behavior. This
> makes it a good candidate for a default behavior.

I have something else in the bag that does something like this. The
idea is that we go ahead and do destructive things but we let the user
undo.

Some more background in [1] but basically we hash "every" change and
store in the object database (in this case we store "foo" content
before overwriting it). We maintain a list of these hashes so that
undo is possible, but of course we don't keep infinite change history,
eventually too old changes will be pruned. [1] talks about index
changes (e.g. "git add -p") but it could apply to worktree changes as
well (and I'm also eyeing $GIT_DIR/config changes).

The upside: a similar undo mechanism that works for more than just
this case and it allows undoing multiple times while foo~ only allow
once. The downside: hashing is definitely heavier than renaming foo to
foo~. So this will feature be opt-in in most cases. But for
"dangerous" overwrite like this case, I think we value the file
content more and make it opt-out.

[1] 
https://public-inbox.org/git/CACsJy8A3QCYY6QeJQYkbCKYh=7Q7pj=rer_oqhlgoamqtno...@mail.gmail.com/
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

>> Or only selected "*.o" (vendor supplied binary blob) kept tracked
>> while everything else is built from the source.
>> ...
> But it also does get used for "mostly we don't want this file, but
> sometimes we do" use-case, so that's something we need to deal with in
> practice.

Exactly.  "Mostly we don't want *.o as we prefer to build from the
source, but we have only object files for some selected ones" is an
often cited use case where it is the BCP to have *.o in .gitignore
and use "add -f" to add the "selected" ones initially.



Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Junio C Hamano wrote:

> Ævar Arnfjörð Bjarmason  writes:
>
>> What I'd add to your list is:
>>
>> * Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
>>   else usually doesn't belong in the repo as a "soft ignore". This is
>>   something we've never recommended, but have implicitly supported since
>>   the only caveats are a) you need a one-off "git add -f" and then
>>   they're tracked b) them being missing from "status" before being
>>   tracked c) the issue under discussion here.
>
> Or only selected "*.o" (vendor supplied binary blob) kept tracked
> while everything else is built from the source.
>
> I do not know who you are referring to "we" in your sentence, but as
> far as I am concerned, it has been and still is a BCP recommendation
> on this list to deal with a case like that.

I mean that this use-case of having a "soft" ignore by carrying it
across the "git add" barrier with a one-off "-f" isn't something
explicitly documented, and apparently not something many
expect. I.e. you / Matthieu have mentioned .gitignore in the past for
only-generated *.o use-case.

But it also does get used for "mostly we don't want this file, but
sometimes we do" use-case, so that's something we need to deal with in
practice. Like many workflows in git it's not something that was forseen
or intended, but does happen in the wild.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Junio C Hamano
Ævar Arnfjörð Bjarmason  writes:

> What I'd add to your list is:
>
> * Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
>   else usually doesn't belong in the repo as a "soft ignore". This is
>   something we've never recommended, but have implicitly supported since
>   the only caveats are a) you need a one-off "git add -f" and then
>   they're tracked b) them being missing from "status" before being
>   tracked c) the issue under discussion here.

Or only selected "*.o" (vendor supplied binary blob) kept tracked
while everything else is built from the source.

I do not know who you are referring to "we" in your sentence, but as
far as I am concerned, it has been and still is a BCP recommendation
on this list to deal with a case like that.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Matthieu Moy
"Per Lundberg"  wrote:

> On 11/11/18 5:41 PM, Duy Nguyen wrote:
> > On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
> >  wrote:
>
> >> That will lose no data, and in the very rare cases where a checkout of
> >> tracked files would overwrite an ignored pattern, we can just error out
> >> (as we do with the "Ok to overwrite" branch removed) and tell the user
> >> to delete the files to proceed.
> > There's also the other side of the coin. If this refuse to overwrite
> > triggers too often, it can become an annoyance.

I may have missed some cases, but to me the cases when checkout may try
to overwrite an ignored file are essentially:

* Someone "git add"ed a file meant to be ignored by mistake (e.g.
  "git add -f *.o").

* A file that was meant to be kept private (e.g. config.mak.dev) ends
  up being tracked. This may happen when we find a way to make per-developer
  settings the same for everyone.

I both cases I'd want at least to be notified that something is going on,
and in the second I'd probably want to keep my local file around.

> If we feel thrashable is stretching it too far (which I don't think it
> is), we could add a "core.ignore_files_are_trashable" setting that
> brings back the old semantics, for those who have a strong feeling about it.

May I remind an idea I sugested in an old thread: add an intermediate level
where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
backup files):

https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/

One advantage of the "rename" behavior is that it's safer that the current,
but still not very disturbing for people who like the current behavior. This
makes it a good candidate for a default behavior.

This could come in complement with this thread's "precious" concept:

* If you know what you're doing and know that such or such file is precious,
  mark it as such and Git will never overwrite it.

* If you don't know about precious files, just keep the default setting and
  the worse that can happen is to get your file overwritten with a bakup
  of the old version kept around.

This would probably play better with a notion of "precious" files than with
a notion of "trashable" files.

-- 
Matthieu Moy
https://matthieu-moy.fr/


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-12 Thread Ævar Arnfjörð Bjarmason


On Mon, Nov 12 2018, Matthieu Moy wrote:

> "Per Lundberg"  wrote:
>
>> On 11/11/18 5:41 PM, Duy Nguyen wrote:
>> > On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
>> >  wrote:
>>
>> >> That will lose no data, and in the very rare cases where a checkout of
>> >> tracked files would overwrite an ignored pattern, we can just error out
>> >> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> >> to delete the files to proceed.
>> > There's also the other side of the coin. If this refuse to overwrite
>> > triggers too often, it can become an annoyance.
>
> I may have missed some cases, but to me the cases when checkout may try
> to overwrite an ignored file are essentially:
>
> * Someone "git add"ed a file meant to be ignored by mistake (e.g.
>   "git add -f *.o").
>
> * A file that was meant to be kept private (e.g. config.mak.dev) ends
>   up being tracked. This may happen when we find a way to make per-developer
>   settings the same for everyone.

Yes, the cases under discussion here are all cases where a tracked file
clobbers a file matching a pattern in in .gitignore.

What I'd add to your list is:

* Some projects (I've seen this in the wild) add e.g. *.mp3 or whatever
  else usually doesn't belong in the repo as a "soft ignore". This is
  something we've never recommended, but have implicitly supported since
  the only caveats are a) you need a one-off "git add -f" and then
  they're tracked b) them being missing from "status" before being
  tracked c) the issue under discussion here.

* Cases where e.g. filename changes to a directory or globs because of
  that make this more complex.

> I both cases I'd want at least to be notified that something is going on,
> and in the second I'd probably want to keep my local file around.
>> If we feel thrashable is stretching it too far (which I don't think it
>> is), we could add a "core.ignore_files_are_trashable" setting that
>> brings back the old semantics, for those who have a strong feeling about it.
>
> May I remind an idea I sugested in an old thread: add an intermediate level
> where ignored files to be overwritten are renamed (eg. foo -> foo~ like Emacs'
> backup files):
>
> https://public-inbox.org/git/vpqd3t9656k@bauges.imag.fr/
>
> One advantage of the "rename" behavior is that it's safer that the current,
> but still not very disturbing for people who like the current behavior. This
> makes it a good candidate for a default behavior.
>
> This could come in complement with this thread's "precious" concept:
>
> * If you know what you're doing and know that such or such file is precious,
>   mark it as such and Git will never overwrite it.
>
> * If you don't know about precious files, just keep the default setting and
>   the worse that can happen is to get your file overwritten with a bakup
>   of the old version kept around.
>
> This would probably play better with a notion of "precious" files than with
> a notion of "trashable" files.

I used to think this foo -> foo~ approach made the most sense (and said
as much in
https://public-inbox.org/git/871s8qdzph@evledraar.gmail.com/) but I
think it's probably best not to do it and just error out, because:

 * We'd still need to handle the cases where "tests" the file collides
   with "tests" the directory. Then where do we move the colliding file?
   ~/.git/lost+found/* ? We could handle the subdir case with another
   special-case though...

 * I think such silent action will just leave users more confused, and
   in many cases (e.g. a merge) whatever message we print out will be
   missed in a deluge of other messaging, and they'll probably miss it.

   I'd like to avoid a case where a bunch of *~ files get committed
   because the user's workflow is (and some beginner git users do this):

   git pull && git add . && git commit && git push

   As the "pull" would now invoke a merge that would do this rename.

 * If I have the "foo" file open in my editor (a plausible way to run
   into this) I switch to another terminal, do the merge, miss the
   message, then re-save "foo". Now I have both "foo" and "foo~"
   on-disk. Another case where we should just refuse until the user
   resolves the situation to avoid the confusion.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Per Lundberg
On 11/11/18 5:41 PM, Duy Nguyen wrote:
> On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
>  wrote:
 >
>> That will lose no data, and in the very rare cases where a checkout of
>> tracked files would overwrite an ignored pattern, we can just error out
>> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> to delete the files to proceed. 
> There's also the other side of the coin. If this refuse to overwrite
> triggers too often, it can become an annoyance.

Sure, but doing "git checkout -f some_ref" instead of "git checkout 
some_ref" isn't really _that_ annoying, is it? I think, people (because 
of not having read/studied the .gitignore semantics well enough) having 
their files being overwritten _without realizing it_ is a bigger danger. 
But obviously there is a bit of treading a thin line here.

If we feel thrashable is stretching it too far (which I don't think it 
is), we could add a "core.ignore_files_are_trashable" setting that 
brings back the old semantics, for those who have a strong feeling about it.

It's also quite possible to do it the other way around - i.e. set 
"core.ignore_files_are_trashable" to true by default, and let the "new" 
behavior be opt-in. However, this might "miss the mark" in that those 
people who would really benefit from the new semantics might miss this 
setting, just like they could risk missing the "precious" setting.

(I also think "trashable" sounds better and is more clear & precise than 
"precious", for whatever that is worth.)
--
Per Lundberg


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Ævar Arnfjörð Bjarmason


On Sun, Nov 11 2018, Duy Nguyen wrote:

> On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
>  wrote:
>> The users who need protection against git deleting their files the most
>> are exactly the sort of users who aren't expert-level enough to
>> understand the nuances of how the semantics of .gitignore and "precious"
>> are going to interact before git eats their data.
>>
>> This is pretty apparent from the bug reports we're getting about
>> this. None of them are:
>>
>> "Hey, I 100% understood .gitignore semantics including this one part
>> of the docs where you say you'll do this, but just forgot one day
>> and deleted my work. Can we get some more safety?"
>>
>> But rather (with some hyperbole for effect):
>>
>> "ZOMG git deleted my file! Is this a bug??"
>>
>> So I think we should have the inverse of this "precious"
>> attribute". Just a change to the docs to say that .gitignore doesn't
>> imply these eager deletion semantics on tree unpacking anymore, and if
>> users want it back they can define a "garbage" attribute
>> (s/precious/garbage/).
>>
>> That will lose no data, and in the very rare cases where a checkout of
>> tracked files would overwrite an ignored pattern, we can just error out
>> (as we do with the "Ok to overwrite" branch removed) and tell the user
>> to delete the files to proceed.
>
> There's also the other side of the coin. If this refuse to overwrite
> triggers too often, it can become an annoyance. So far I've seen two
> reports of accident overwriting which make me think turning precious
> to trashable may be too extreme. Plus ignored files are trashable by
> default (or at least by design so far), adding trashable attribute
> changes how we handle ignored files quite significantly.

Yeah I'm not trying to make the argument that we should just go with
these user bug reports, clearly that's just going to give us selection
bias and we could continue to flip between the two behaviors with that
approach. Just that an advanced opt-in feature to prevent dataloss will
not prevent it in practice.

Is taking my patch the right thing? I don't know. I'm leaning in that
direction, but more making a devil's advocate argument to see if anyone
finds good cases that'll demonstrate how it's bad. I haven't read/seen
them so far, and the test suite didn't have any.

I did go through the list archives as Junio suggested in
https://public-inbox.org/git/7viq39avay@alter.siamese.dyndns.org/
and found these two:
https://public-inbox.org/git/?q=d%3A20070301..20070331+verify_absent

It seems to me that the reason we ended up with this behavior is a bug
report from Shawn that was describing a similar but not quite the same
problem:

"[...]a bug in read-tree -m that prevents him from switching
branches when the type of a path changes between a directory and a
file.[...]"

That's not the same as when a now-tracked file clobbers a .gitignored
file. As far as I can tell (but may not have read carefully enough) that
wasn't a problem anyone reported, but was changed while fixing another
bug in c81935348b ("Fix switching to a branch with D/F when current
branch has file D.", 2007-03-15).


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Duy Nguyen
On Sun, Nov 11, 2018 at 1:33 PM Ævar Arnfjörð Bjarmason
 wrote:
> The users who need protection against git deleting their files the most
> are exactly the sort of users who aren't expert-level enough to
> understand the nuances of how the semantics of .gitignore and "precious"
> are going to interact before git eats their data.
>
> This is pretty apparent from the bug reports we're getting about
> this. None of them are:
>
> "Hey, I 100% understood .gitignore semantics including this one part
> of the docs where you say you'll do this, but just forgot one day
> and deleted my work. Can we get some more safety?"
>
> But rather (with some hyperbole for effect):
>
> "ZOMG git deleted my file! Is this a bug??"
>
> So I think we should have the inverse of this "precious"
> attribute". Just a change to the docs to say that .gitignore doesn't
> imply these eager deletion semantics on tree unpacking anymore, and if
> users want it back they can define a "garbage" attribute
> (s/precious/garbage/).
>
> That will lose no data, and in the very rare cases where a checkout of
> tracked files would overwrite an ignored pattern, we can just error out
> (as we do with the "Ok to overwrite" branch removed) and tell the user
> to delete the files to proceed.

There's also the other side of the coin. If this refuse to overwrite
triggers too often, it can become an annoyance. So far I've seen two
reports of accident overwriting which make me think turning precious
to trashable may be too extreme. Plus ignored files are trashable by
default (or at least by design so far), adding trashable attribute
changes how we handle ignored files quite significantly.
-- 
Duy


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy   writes:

> Also while "precious" is a fun name, but it does not sound serious.
> Any suggestions? Perhaps "valuable"?

FWIW, I am reasonably sure that I was the first in Git circle who
used the term "precious" in discussions wrt .gitignore, i.e. "Git
has ignored but not precious category".  Since it was not my
invention but was a borrowed term from tla (aka GNU arch), I'd
suggest to keep using that term, unless there is a strong reason not
to follow longstanding precedent.


Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Ævar Arnfjörð Bjarmason


On Sun, Nov 11 2018, Ævar Arnfjörð Bjarmason wrote:

> [CC-ing some of the people involved in recent threads about this]
>
> On Sun, Nov 11 2018, Nguyễn Thái Ngọc Duy wrote:
>
>> Since this topic has come up twice recently, I revisited this
>> "precious" thingy that I started four years ago and tried to see if I
>> could finally finish it. There are a couple things to be sorted out...
>>
>> A new attribute "precious" is added to indicate that certain files
>> have valuable content and should not be easily discarded even if they
>> are ignored or untracked (*).
>>
>> So far there are two parts of Git that are made aware of precious
>> files: "git clean" will leave precious files alone and unpack-trees.c
>> (i.e. merges and branch switches) will not overwrite
>> ignored-but-precious files.
>>
>> Is there any other parts of Git that should be made aware of this
>> "precious" attribute?
>>
>> Also while "precious" is a fun name, but it does not sound serious.
>> Any suggestions? Perhaps "valuable"?
>>
>> Very lightly tested. The patch is more to have something to discuss
>> than is bug free and ready to use.
>>
>> (*) Note that tracked files could be marked "precious" in the future
>> too although the exact semantics is not very clear since tracked
>> files are by default precious.
>>
>> But something like "index log" could use this to record all
>> changes to precious files instead of just "git add -p" changes,
>> for example. So these files are in a sense more precious than
>> other tracked files.
>>
>> Signed-off-by: Nguyễn Thái Ngọc Duy 
>> ---
>>  Documentation/git-clean.txt |  3 ++-
>>  Documentation/gitattributes.txt | 13 +
>>  attr.c  |  9 +
>>  attr.h  |  2 ++
>>  builtin/clean.c | 19 ---
>>  unpack-trees.c  |  3 ++-
>>  6 files changed, 44 insertions(+), 5 deletions(-)
>>
>> diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
>> index 03056dad0d..a9beadfb12 100644
>> --- a/Documentation/git-clean.txt
>> +++ b/Documentation/git-clean.txt
>> @@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This 
>> can, for
>>  example, be useful to remove all build products.
>>
>>  If any optional `...` arguments are given, only those paths
>> -are affected.
>> +are affected. Ignored or untracked files with `precious` attributes
>> +are not removed.
>>
>>  OPTIONS
>>  ---
>> diff --git a/Documentation/gitattributes.txt 
>> b/Documentation/gitattributes.txt
>> index b8392fc330..c722479bdc 100644
>> --- a/Documentation/gitattributes.txt
>> +++ b/Documentation/gitattributes.txt
>> @@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, 
>> the value of the
>>  (See linkgit:git-config[1]).
>>
>>
>> +Precious files
>> +
>> +
>> +`precious`
>> +^^
>> +
>> +This attribute is set on files to indicate that their content is
>> +valuable. Many commands will behave slightly different on precious
>> +files. linkgit:git-clean[1] will leave precious files alone. Merging
>> +and branch switching will not silently overwrite ignored files that
>> +are marked "precious".
>> +
>> +
>>  USING MACRO ATTRIBUTES
>>  --
>>
>> diff --git a/attr.c b/attr.c
>> index 60d284796d..d06ca0ae4b 100644
>> --- a/attr.c
>> +++ b/attr.c
>> @@ -1186,3 +1186,12 @@ void attr_start(void)
>>  pthread_mutex_init(_vector.mutex, NULL);
>>  #endif
>>  }
>> +
>> +int is_precious_file(struct index_state *istate, const char *path)
>> +{
>> +static struct attr_check *check;
>> +if (!check)
>> +check = attr_check_initl("precious", NULL);
>> +git_check_attr(istate, path, check);
>> +return check && ATTR_TRUE(check->items[0].value);
>> +}
>
> If we merge two branches is this using the merged post-image of
> .gitattributes as a source?
>
>>  if (o->dir &&
>> -is_excluded(o->dir, o->src_index, name, ))
>> +is_excluded(o->dir, o->src_index, name, ) &&
>> +!is_precious_file(o->src_index, name))
>>  /*
>>   * ce->name is explicitly excluded, so it is Ok to
>>   * overwrite it.
>
> I wonder if instead we should just be reverting c81935348b ("Fix
> switching to a branch with D/F when current branch has file D.",
> 2007-03-15), which these days (haven't dug deeply) would just be this,
> right?:
>
>>diff --git a/unpack-trees.c b/unpack-trees.c
> index 7570df481b..b3efaddd4f 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -1894,13 +1894,6 @@ static int check_ok_to_remove(const char *name, 
> int len, int dtype,
>   if (ignore_case && icase_exists(o, name, len, st))
>   return 0;
>
> - if (o->dir &&
> - is_excluded(o->dir, o->src_index, name, ))
> - /*
> -  * ce->name is explicitly excluded, so it is Ok to
> -  * overwrite it.
> -  

Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Ævar Arnfjörð Bjarmason


[CC-ing some of the people involved in recent threads about this]

On Sun, Nov 11 2018, Nguyễn Thái Ngọc Duy wrote:

> Since this topic has come up twice recently, I revisited this
> "precious" thingy that I started four years ago and tried to see if I
> could finally finish it. There are a couple things to be sorted out...
>
> A new attribute "precious" is added to indicate that certain files
> have valuable content and should not be easily discarded even if they
> are ignored or untracked (*).
>
> So far there are two parts of Git that are made aware of precious
> files: "git clean" will leave precious files alone and unpack-trees.c
> (i.e. merges and branch switches) will not overwrite
> ignored-but-precious files.
>
> Is there any other parts of Git that should be made aware of this
> "precious" attribute?
>
> Also while "precious" is a fun name, but it does not sound serious.
> Any suggestions? Perhaps "valuable"?
>
> Very lightly tested. The patch is more to have something to discuss
> than is bug free and ready to use.
>
> (*) Note that tracked files could be marked "precious" in the future
> too although the exact semantics is not very clear since tracked
> files are by default precious.
>
> But something like "index log" could use this to record all
> changes to precious files instead of just "git add -p" changes,
> for example. So these files are in a sense more precious than
> other tracked files.
>
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  Documentation/git-clean.txt |  3 ++-
>  Documentation/gitattributes.txt | 13 +
>  attr.c  |  9 +
>  attr.h  |  2 ++
>  builtin/clean.c | 19 ---
>  unpack-trees.c  |  3 ++-
>  6 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/Documentation/git-clean.txt b/Documentation/git-clean.txt
> index 03056dad0d..a9beadfb12 100644
> --- a/Documentation/git-clean.txt
> +++ b/Documentation/git-clean.txt
> @@ -21,7 +21,8 @@ option is specified, ignored files are also removed. This 
> can, for
>  example, be useful to remove all build products.
>
>  If any optional `...` arguments are given, only those paths
> -are affected.
> +are affected. Ignored or untracked files with `precious` attributes
> +are not removed.
>
>  OPTIONS
>  ---
> diff --git a/Documentation/gitattributes.txt b/Documentation/gitattributes.txt
> index b8392fc330..c722479bdc 100644
> --- a/Documentation/gitattributes.txt
> +++ b/Documentation/gitattributes.txt
> @@ -1188,6 +1188,19 @@ If this attribute is not set or has an invalid value, 
> the value of the
>  (See linkgit:git-config[1]).
>
>
> +Precious files
> +
> +
> +`precious`
> +^^
> +
> +This attribute is set on files to indicate that their content is
> +valuable. Many commands will behave slightly different on precious
> +files. linkgit:git-clean[1] will leave precious files alone. Merging
> +and branch switching will not silently overwrite ignored files that
> +are marked "precious".
> +
> +
>  USING MACRO ATTRIBUTES
>  --
>
> diff --git a/attr.c b/attr.c
> index 60d284796d..d06ca0ae4b 100644
> --- a/attr.c
> +++ b/attr.c
> @@ -1186,3 +1186,12 @@ void attr_start(void)
>   pthread_mutex_init(_vector.mutex, NULL);
>  #endif
>  }
> +
> +int is_precious_file(struct index_state *istate, const char *path)
> +{
> + static struct attr_check *check;
> + if (!check)
> + check = attr_check_initl("precious", NULL);
> + git_check_attr(istate, path, check);
> + return check && ATTR_TRUE(check->items[0].value);
> +}

If we merge two branches is this using the merged post-image of
.gitattributes as a source?

>   if (o->dir &&
> - is_excluded(o->dir, o->src_index, name, ))
> + is_excluded(o->dir, o->src_index, name, ) &&
> + !is_precious_file(o->src_index, name))
>   /*
>* ce->name is explicitly excluded, so it is Ok to
>* overwrite it.

I wonder if instead we should just be reverting c81935348b ("Fix
switching to a branch with D/F when current branch has file D.",
2007-03-15), which these days (haven't dug deeply) would just be this,
right?:

>diff --git a/unpack-trees.c b/unpack-trees.c
index 7570df481b..b3efaddd4f 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1894,13 +1894,6 @@ static int check_ok_to_remove(const char *name, int 
len, int dtype,
if (ignore_case && icase_exists(o, name, len, st))
return 0;

-   if (o->dir &&
-   is_excluded(o->dir, o->src_index, name, ))
-   /*
-* ce->name is explicitly excluded, so it is Ok to
-* overwrite it.
-*/
-   return 0;
if (S_ISDIR(st->st_mode)) {
/*
 * We are checking out path "foo" and

Something like the approach you're 

Re: [RFC PATCH] Introduce "precious" file concept

2018-11-11 Thread Bert Wesarg
On Sun, Nov 11, 2018 at 10:53 AM Nguyễn Thái Ngọc Duy  wrote:
>
> Also while "precious" is a fun name, but it does not sound serious.
> Any suggestions? Perhaps "valuable"?

"precious" is also used by POSIX make:

http://pubs.opengroup.org/onlinepubs/9699919799/utilities/make.html

Bert