Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-15 Thread Jacob Keller
On Fri, Sep 15, 2017 at 10:21 AM, Kevin Willford  wrote:
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Thursday, September 14, 2017 11:00 PM
>>
>> Kevin Willford  writes:
>>
>> > 1. Does this statement, "I only care about the files in this
>> > sparse checkout, and do not concern me with anything else", mean
>> > that git should not change files outside the sparse-checkout whether
>> > that be in the working directory or in the index?  Or does that only
>> > apply to the working directory and the index version of files can
>> > change to whatever git the git command would do without using
>> > sparse?  For example if I am doing a 'git reset HEAD~1'  should the
>> > version in the index of files outside the sparse not be changed or
>> > change to the HEAD~1 version with the skip-worktree bit on?
>>
>> My understanding of the purpose of "sparse checkout" thing is that
>> the user still wants to create correct whole-tree commit even the
>> user does not have the whole-tree checkout.  The object names for
>> blobs recorded in the index that are meant to be included in the
>> next commit MUST be the same as those that would be in the index
>> when the "sparse" feature is not in use.  "reset HEAD~1" should
>> match the index entries to the tree entries in HEAD~1.  So, the
>> latter, I would think, among your two alternatives.
>>
>> IOW, after "git reset HEAD~", if you drop the skip-worktree bit from
>> all index entries, "git diff --cached HEAD" must say "there is no
>> changes".
>>
>> The only difference between the "sparse" and normal would be that,
>> because the "sparse" user does not intend to change anything outside
>> the "sparse" area, these paths outside her "sparse" area would not
>> materialize on the filesystem.  For the next "write-tree" out of the
>> index to still write the correct tree out, the entries outside her
>> "sparse" area in the index MUST match the tree of the commit she
>> started working from.
>>
>
> Makes sense.  And even though the reset might only change entries
> outside the sparse area and the next status will report "nothing to commit,
> working tree clean",  that's okay because the user hasn't changed
> anything in their sparse area and intended to roll back the index to
> whatever they specified in their reset command.
>
>> > 2. How will this work with other git commands like merge, rebase,
>> > cherry-pick, etc.?
>> > 3. What about when there is a merge conflict with a file that is outside
>> > the sparse checkout?
>>
>> I would say, rather than forbidding such a merge, it should let her
>> see and deal with the conflict by dropping the "this is outside the
>> sparse area, so do not bother materializing it to the filesystem"
>> bit, but tell her loudly what it did ("I checked out a half-merged
>> file outside your sparse-checkout area because you'd need it while
>> resolving the conflict").  By doing things that way, the user can
>> decide if she wants to go ahead and complete the merge, even if the
>> conflict is outside the area she is currently interested in, or
>> postpone the merge and continue working on what she has been working
>> on inside the narrowed-down area first.
>>
>> I do not have a strong opinion whether the sparse-checkout
>> configuration file should be adjusted to match when the command must
>> tentatively bust the sparse checkout area; I'd imagine it can be
>> argued either way.
>>
>> Note that "sparse" is not my itch, and I would not be surprised if
>> those who designed it may want it to work differently from my
>> knee-jerk reaction in the previous two paragraphs, and I may even
>> find such an alternative solution preferable.
>>
>> But it is highly unlikely for any sensible solution would violate
>> the basic premise, i.e. "the indexed contents will stay the same as
>> the case without any 'sparse', so the next write-tree will do the
>> right thing".
>
> There was one other case that I thought about while implementing
> this approach and it is when the user creates a file that is outside their
> sparse definition.  From your explanation above I will attempt to
> explain how I think it should work and please correct me if you see
> it working differently.
>
> The user creates a file that is outside the sparse area and it will show
> up as untracked.  No problem here since the untracked are outside the
> scope of using sparse.  Next the user adds the untracked file to the index.
> The skip-worktree bit should be off and stay off since the user could make
> additional changes and want to add them.  Once the user commits the
> newly created file, I could see turning on the skip-worktree bit and
> removing the file from the working tree after the commit but since
> this was a user initiated action and it is sparse "checkout" it makes
> sense to wait until the next "checkout" command which will use the
> sparse definition to clean this file from the working directory and
> turn on the skip-worktree 

RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-15 Thread Kevin Willford
From: Junio C Hamano [mailto:gits...@pobox.com]
Sent: Thursday, September 14, 2017 11:00 PM
> 
> Kevin Willford  writes:
> 
> > 1. Does this statement, "I only care about the files in this
> > sparse checkout, and do not concern me with anything else", mean
> > that git should not change files outside the sparse-checkout whether
> > that be in the working directory or in the index?  Or does that only
> > apply to the working directory and the index version of files can
> > change to whatever git the git command would do without using
> > sparse?  For example if I am doing a 'git reset HEAD~1'  should the
> > version in the index of files outside the sparse not be changed or
> > change to the HEAD~1 version with the skip-worktree bit on?
> 
> My understanding of the purpose of "sparse checkout" thing is that
> the user still wants to create correct whole-tree commit even the
> user does not have the whole-tree checkout.  The object names for
> blobs recorded in the index that are meant to be included in the
> next commit MUST be the same as those that would be in the index
> when the "sparse" feature is not in use.  "reset HEAD~1" should
> match the index entries to the tree entries in HEAD~1.  So, the
> latter, I would think, among your two alternatives.
> 
> IOW, after "git reset HEAD~", if you drop the skip-worktree bit from
> all index entries, "git diff --cached HEAD" must say "there is no
> changes".
> 
> The only difference between the "sparse" and normal would be that,
> because the "sparse" user does not intend to change anything outside
> the "sparse" area, these paths outside her "sparse" area would not
> materialize on the filesystem.  For the next "write-tree" out of the
> index to still write the correct tree out, the entries outside her
> "sparse" area in the index MUST match the tree of the commit she
> started working from.
> 

Makes sense.  And even though the reset might only change entries
outside the sparse area and the next status will report "nothing to commit,
working tree clean",  that's okay because the user hasn't changed
anything in their sparse area and intended to roll back the index to
whatever they specified in their reset command.

> > 2. How will this work with other git commands like merge, rebase,
> > cherry-pick, etc.?
> > 3. What about when there is a merge conflict with a file that is outside
> > the sparse checkout?
> 
> I would say, rather than forbidding such a merge, it should let her
> see and deal with the conflict by dropping the "this is outside the
> sparse area, so do not bother materializing it to the filesystem"
> bit, but tell her loudly what it did ("I checked out a half-merged
> file outside your sparse-checkout area because you'd need it while
> resolving the conflict").  By doing things that way, the user can
> decide if she wants to go ahead and complete the merge, even if the
> conflict is outside the area she is currently interested in, or
> postpone the merge and continue working on what she has been working
> on inside the narrowed-down area first.
> 
> I do not have a strong opinion whether the sparse-checkout
> configuration file should be adjusted to match when the command must
> tentatively bust the sparse checkout area; I'd imagine it can be
> argued either way.
> 
> Note that "sparse" is not my itch, and I would not be surprised if
> those who designed it may want it to work differently from my
> knee-jerk reaction in the previous two paragraphs, and I may even
> find such an alternative solution preferable.
> 
> But it is highly unlikely for any sensible solution would violate
> the basic premise, i.e. "the indexed contents will stay the same as
> the case without any 'sparse', so the next write-tree will do the
> right thing".

There was one other case that I thought about while implementing
this approach and it is when the user creates a file that is outside their
sparse definition.  From your explanation above I will attempt to
explain how I think it should work and please correct me if you see
it working differently.

The user creates a file that is outside the sparse area and it will show
up as untracked.  No problem here since the untracked are outside the
scope of using sparse.  Next the user adds the untracked file to the index.
The skip-worktree bit should be off and stay off since the user could make
additional changes and want to add them.  Once the user commits the
newly created file, I could see turning on the skip-worktree bit and
removing the file from the working tree after the commit but since
this was a user initiated action and it is sparse "checkout" it makes
sense to wait until the next "checkout" command which will use the
sparse definition to clean this file from the working directory and
turn on the skip-worktree flag.  If the user wants to continue to use
the new file, they would need to update their sparse-checkout file
to include the newly created file so that the next checkout will not

Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-14 Thread Junio C Hamano
Kevin Willford  writes:

> 1. Does this statement, "I only care about the files in this
> sparse checkout, and do not concern me with anything else", mean
> that git should not change files outside the sparse-checkout whether
> that be in the working directory or in the index?  Or does that only
> apply to the working directory and the index version of files can
> change to whatever git the git command would do without using
> sparse?  For example if I am doing a 'git reset HEAD~1'  should the
> version in the index of files outside the sparse not be changed or
> change to the HEAD~1 version with the skip-worktree bit on?

My understanding of the purpose of "sparse checkout" thing is that
the user still wants to create correct whole-tree commit even the
user does not have the whole-tree checkout.  The object names for
blobs recorded in the index that are meant to be included in the
next commit MUST be the same as those that would be in the index
when the "sparse" feature is not in use.  "reset HEAD~1" should
match the index entries to the tree entries in HEAD~1.  So, the
latter, I would think, among your two alternatives.

IOW, after "git reset HEAD~", if you drop the skip-worktree bit from
all index entries, "git diff --cached HEAD" must say "there is no
changes".

The only difference between the "sparse" and normal would be that,
because the "sparse" user does not intend to change anything outside
the "sparse" area, these paths outside her "sparse" area would not
materialize on the filesystem.  For the next "write-tree" out of the
index to still write the correct tree out, the entries outside her
"sparse" area in the index MUST match the tree of the commit she
started working from.

> 2. How will this work with other git commands like merge, rebase,
> cherry-pick, etc.?  
> 3. What about when there is a merge conflict with a file that is outside
> the sparse checkout?

I would say, rather than forbidding such a merge, it should let her
see and deal with the conflict by dropping the "this is outside the
sparse area, so do not bother materializing it to the filesystem"
bit, but tell her loudly what it did ("I checked out a half-merged
file outside your sparse-checkout area because you'd need it while
resolving the conflict").  By doing things that way, the user can
decide if she wants to go ahead and complete the merge, even if the
conflict is outside the area she is currently interested in, or
postpone the merge and continue working on what she has been working
on inside the narrowed-down area first.

I do not have a strong opinion whether the sparse-checkout
configuration file should be adjusted to match when the command must
tentatively bust the sparse checkout area; I'd imagine it can be
argued either way.

Note that "sparse" is not my itch, and I would not be surprised if
those who designed it may want it to work differently from my
knee-jerk reaction in the previous two paragraphs, and I may even
find such an alternative solution preferable.  

But it is highly unlikely for any sensible solution would violate
the basic premise, i.e. "the indexed contents will stay the same as
the case without any 'sparse', so the next write-tree will do the
right thing".


RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-14 Thread Kevin Willford
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Wednesday, September 13, 2017 4:18 PM
> 
> Jacob Keller  writes:
> 
> > By definition if you do a sparse checkout, you're telling git "I only
> > care about the files in this sparse checkout, and do not concern me
> > with anything else"... So the proposed fix is "since git cleared the
> > skip-worktree flag, we should actually also copy the file out again."
> > but I think the real problem is that we're clearing skip worktree to
> > begin with?
> 
> As this 100% agree with what I've been saying, I do not have
> anything to add to the discussion at the moment, so I'll stop
> speaking now but will continue to monitor the thread so that I may
> hear a new argument and datapoint that would make me change my mind.
> 
> Thanks for a healthy discussion.

Hi Junio,

Thanks for the feedback.  I will give this implementation a try.  I still
have the following unanswered questions, if you wouldn't mind
answering.

1. Does this statement, "I only care about the files in this
sparse checkout, and do not concern me with anything else", mean
that git should not change files outside the sparse-checkout whether
that be in the working directory or in the index?  Or does that only
apply to the working directory and the index version of files can
change to whatever git the git command would do without using
sparse?  For example if I am doing a 'git reset HEAD~1'  should the
version in the index of files outside the sparse not be changed or
change to the HEAD~1 version with the skip-worktree bit on?
2. How will this work with other git commands like merge, rebase,
cherry-pick, etc.?  
3. What about when there is a merge conflict with a file that is outside
the sparse checkout?  Should there not be a conflict because
git shouldn't be trying to merge the file since it is outside the sparse
checkout area?  Should the conflicted file get written outside the
sparse checkout so the user can resolve it?

Thanks,
Kevin



Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-13 Thread Junio C Hamano
Jacob Keller  writes:

> By definition if you do a sparse checkout, you're telling git "I only
> care about the files in this sparse checkout, and do not concern me
> with anything else"... So the proposed fix is "since git cleared the
> skip-worktree flag, we should actually also copy the file out again."
> but I think the real problem is that we're clearing skip worktree to
> begin with?

As this 100% agree with what I've been saying, I do not have
anything to add to the discussion at the moment, so I'll stop
speaking now but will continue to monitor the thread so that I may
hear a new argument and datapoint that would make me change my mind.

Thanks for a healthy discussion.


RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-13 Thread Kevin Willford
> From: Jacob Keller [mailto:jacob.kel...@gmail.com]
> Sent: Tuesday, September 12, 2017 7:39 PM
> 
> On Tue, Sep 12, 2017 at 4:30 PM, Kevin Willford  wrote:
> >
> > Sorry.  It was not in the sparse-checkout file.
> > $ git config core.sparsecheckout true
> > $ echo /i > .git/info/sparse-checkout
> > $ git checkout -f
> > was ran in the modified file case as well and "i" was the only file in the
> > working directory before reset.
> >
> 
> 
> I'm assuming then that you mean that some file "m" is modified by the
> commit, and do not mean to say that it has local modifications in the
> working tree? That is not what I had inferred from earlier, so I was
> very much confused.
> 

Yes

> In this example, the only file actually checked out is "i", as
> everything else will have the skip-worktree bit set..?
> 

Yes

> So doing git reset HEAD~1 will reset the branch back one commit, and
> now because of this reset is clearing the skip worktree flag, and
> since you never had a copy of it checked out it is notified as
> deleted, because it's no longer marked as skip-worktree?
> 

Correct

> 
> I think the real fix is to stop having reset clear skip-worktree, no?
> 
> By definition if you do a sparse checkout, you're telling git "I only
> care about the files in this sparse checkout, and do not concern me
> with anything else"... So the proposed fix is "since git cleared the
> skip-worktree flag, we should actually also copy the file out again."
> but I think the real problem is that we're clearing skip worktree to
> begin with?

This certainly is an option but I would have some questions with this
approach.  Does this statement, "I only care about the files in this
sparse checkout, and do not concern me with anything else", mean
that git should not change files outside the sparse-checkout whether
they are in the working directory or the index?  Or does that only
apply to the working directory and the index version of files can
change to whatever git decides?  So in the example above would
"m" be the HEAD~1 version of the file in the index or the HEAD
version before the reset?  Does this apply to all git commands,
merge, rebase, cherry-pick, etc?  What about when there is a merge
conflict with a file that is outside the sparse checkout?

Seems to me it is a lot more complex than only caring about the files
in the sparse checkout and no concern for anything else.  Personally
I would like to error on the side of letting the user decide what they
want to do, even if that means turning off the skip-worktree bit and
putting the working directory into an expected state.




Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-12 Thread Jacob Keller
On Tue, Sep 12, 2017 at 4:30 PM, Kevin Willford  wrote:
>
> Sorry.  It was not in the sparse-checkout file.
> $ git config core.sparsecheckout true
> $ echo /i > .git/info/sparse-checkout
> $ git checkout -f
> was ran in the modified file case as well and "i" was the only file in the
> working directory before reset.
>


I'm assuming then that you mean that some file "m" is modified by the
commit, and do not mean to say that it has local modifications in the
working tree? That is not what I had inferred from earlier, so I was
very much confused.

In this example, the only file actually checked out is "i", as
everything else will have the skip-worktree bit set..?

So doing git reset HEAD~1 will reset the branch back one commit, and
now because of this reset is clearing the skip worktree flag, and
since you never had a copy of it checked out it is notified as
deleted, because it's no longer marked as skip-worktree?


I think the real fix is to stop having reset clear skip-worktree, no?

By definition if you do a sparse checkout, you're telling git "I only
care about the files in this sparse checkout, and do not concern me
with anything else"... So the proposed fix is "since git cleared the
skip-worktree flag, we should actually also copy the file out again."
but I think the real problem is that we're clearing skip worktree to
begin with?


RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-12 Thread Kevin Willford
> From: Jacob Keller [mailto:jacob.kel...@gmail.com]
> Sent: Tuesday, September 12, 2017 4:29 PM
> 
> On Tue, Sep 12, 2017 at 1:20 PM, Kevin Willford  wrote:
> >
> > I think this is where I need to do a better job of explaining so here is a
> > simple example.
> >
> > I have a file "a" that was added in the latest commit.
> > $ git log --oneline
> > c1fa646 (HEAD -> reset, master) add file a
> > 40b342c Initial commit with file i
> >
> > Running the reset without using a sparse-checkout file
> >
> > $ git reset HEAD~1
> > $ git status
> > On branch reset
> > Untracked files:
> >   (use "git add ..." to include in what will be committed)
> >
> > a
> >
> > nothing added to commit but untracked files present (use "git add" to track)
> >
> > Turning on sparse-checkout and running checkout to make my working
> > directory sparse
> >
> > $ git config core.sparsecheckout true
> > $ echo /i > .git/info/sparse-checkout
> > $ git checkout -f
> >
> > Running reset gives me
> > $ git reset HEAD~1
> > $ git status
> > On branch reset
> > nothing to commit, working tree clean
> > $ git ls-files
> > i
> >
> > file a is gone.  Not in the index and not in the working directory.
> > Nothing to let the user know that anything changed.
> >
> > With a modified file no sparse-checkout
> > $ git log --oneline
> > 6fbd34a (HEAD -> reset, modified) modified file m
> > c734d72 Initial commit with file i and m
> > $ git reset HEAD~1
> > Unstaged changes after reset:
> > M   m
> > $ git status
> > On branch reset
> > Changes not staged for commit:
> >   (use "git add ..." to update what will be committed)
> >   (use "git checkout -- ..." to discard changes in working directory)
> >
> > modified:   m
> >
> > no changes added to commit (use "git add" and/or "git commit -a")
> >
> > With sparse-checkout
> > $ git reset HEAD~1
> > Unstaged changes after reset:
> > D   m
> > $ git status
> > On branch reset
> > Changes not staged for commit:
> >   (use "git add/rm ..." to update what will be committed)
> >   (use "git checkout -- ..." to discard changes in working directory)
> >
> > deleted:m
> >
> > no changes added to commit (use "git add" and/or "git commit -a")
> >
> 
> Wasn't "m" outside the sparse checkout? Or was it a file in the sparse
> checkout? I mean to say, the file after setting up sparse checkout was
> one of the "interesting" files that sparse checked out?
> 
> Or was it in fact a separate file which wasn't there?

Sorry.  It was not in the sparse-checkout file. 
$ git config core.sparsecheckout true
$ echo /i > .git/info/sparse-checkout
$ git checkout -f
was ran in the modified file case as well and "i" was the only file in the
working directory before reset. 

> 
> I would think that in sparse-checkout world, you should only *ever*
> have the files you list in sparse.
> 
> So files outside sparse world should be ignored, not shown and not
> show up in status, but they should absolutely not show up in the
> working tree either.

Sparse checkout uses the skip-worktree bit which has the following
documentation:
https://git-scm.com/docs/git-update-index 
"Skip-worktree bit can be defined in one (long) sentence: When reading
an entry, if it is marked as skip-worktree, then Git pretends its working
directory version is up to date and read the index version instead.

To elaborate, "reading" means checking for file existence, reading file
attributes or file content. The working directory version may be present
or absent. If present, its content may match against the index version or
not. Writing is not affected by this bit, content safety is still first 
priority.
Note that Git can update working directory file, that is marked skip-worktree,
if it is safe to do so (i.e. working directory version matches index version)"

I'm not saying that is the right behavior, just pointing out the documentation.
And from my experience git will turn on and off the skip-worktree flag
depending on the command as we see happening with reset.

> 
> You're not "changing" any commits, because the status of the file at
> HEAD~1 is exactly what HEAD~1 says it is, but you just don't have a
> checked out copy of it.

In the case of reset yes and it matches HEAD and if the sparse flag was on,
the user does not know that the file was changed in any way, which if I add
and commit will create a commit with the file content at HEAD~1
that is outside the sparse area without the user even knowing what
those changes were unless after the commit they run a git diff HEAD~1 and
see.  So files will be changed outside of the sparse area without the
user's knowledge.

> 
> I think the key problem is that reset is clearing the sparse flag of a
> file so that it no longer shows up as sparse, which is why status
> subsequently shows the file deleted (since you don't have a local copy
> anymore).

And in the case of an added file there is not a sparse flag to clear or keep
Because the file didn't exist at HEAD~1 

Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-12 Thread Jacob Keller
On Tue, Sep 12, 2017 at 1:20 PM, Kevin Willford  wrote:
>> From: Junio C Hamano [mailto:gits...@pobox.com]
>> Sent: Monday, September 11, 2017 9:57 PM
>>
>> Let's imagine a path P that is outside the sparse checkout area.
>> And we check out a commit that has P.  P would still be recorded in
>> the index but it would not materialize in the working tree.  "git
>> status" and friends are asked not to treat this as "locally removed",
>> to prevent "commit -a" from recording a removal, of course.
>>
>> Now, let's further imagine that you have a checkout of the same
>> project but at a commit that does not have P.  Then you reset to
>> another commit that does have P.  My understanding of what Kevin's
>> first test wants to demonstrate is that the index is populated with
>> P (because you did reset to a commit with that path) but it does not
>> materialize in the working tree (perhaps because that is outside the
>> sparse checkout area?), yet there is something missing compared to
>> the earlier case where "git status" and friends are asked not to
>> treat P as "locally removed".  They instead show P as locally removed,
>> and "commit -a" would record a removal---that is indeed a problem.
>>
>> Am I reading the problem description correctly so far?  If so, then
>> my answer to my first question (are we solving a right problem?) is
>> yes.
>>
>
> I think this is where I need to do a better job of explaining so here is a
> simple example.
>
> I have a file "a" that was added in the latest commit.
> $ git log --oneline
> c1fa646 (HEAD -> reset, master) add file a
> 40b342c Initial commit with file i
>
> Running the reset without using a sparse-checkout file
>
> $ git reset HEAD~1
> $ git status
> On branch reset
> Untracked files:
>   (use "git add ..." to include in what will be committed)
>
> a
>
> nothing added to commit but untracked files present (use "git add" to track)
>
> Turning on sparse-checkout and running checkout to make my working
> directory sparse
>
> $ git config core.sparsecheckout true
> $ echo /i > .git/info/sparse-checkout
> $ git checkout -f
>
> Running reset gives me
> $ git reset HEAD~1
> $ git status
> On branch reset
> nothing to commit, working tree clean
> $ git ls-files
> i
>
> file a is gone.  Not in the index and not in the working directory.
> Nothing to let the user know that anything changed.
>
> With a modified file no sparse-checkout
> $ git log --oneline
> 6fbd34a (HEAD -> reset, modified) modified file m
> c734d72 Initial commit with file i and m
> $ git reset HEAD~1
> Unstaged changes after reset:
> M   m
> $ git status
> On branch reset
> Changes not staged for commit:
>   (use "git add ..." to update what will be committed)
>   (use "git checkout -- ..." to discard changes in working directory)
>
> modified:   m
>
> no changes added to commit (use "git add" and/or "git commit -a")
>
> With sparse-checkout
> $ git reset HEAD~1
> Unstaged changes after reset:
> D   m
> $ git status
> On branch reset
> Changes not staged for commit:
>   (use "git add/rm ..." to update what will be committed)
>   (use "git checkout -- ..." to discard changes in working directory)
>
> deleted:m
>
> no changes added to commit (use "git add" and/or "git commit -a")
>

Wasn't "m" outside the sparse checkout? Or was it a file in the sparse
checkout? I mean to say, the file after setting up sparse checkout was
one of the "interesting" files that sparse checked out?

Or was it in fact a separate file which wasn't there?

I would think that in sparse-checkout world, you should only *ever*
have the files you list in sparse.

So files outside sparse world should be ignored, not shown and not
show up in status, but they should absolutely not show up in the
working tree either.

You're not "changing" any commits, because the status of the file at
HEAD~1 is exactly what HEAD~1 says it is, but you just don't have a
checked out copy of it.

I think the key problem is that reset is clearing the sparse flag of a
file so that it no longer shows up as sparse, which is why status
subsequently shows the file deleted (since you don't have a local copy
anymore).

Am I understanding the problem correctly? I think your examples above
are not clear because they don't seem to be each complete individually
(The sparse checkout examples should start from a clean world and
explain how they got there rather than relying on imformation in the
prior non sparse example. We should be clear so everyone knows what
the problem is).

If you're performing a sparse checkout and you modify files outside of
the sparse area, I think that will cause problems, and we may not be
making the best efforts to resolve it. I do agree that if you have
created a file "m" when only "i" is in your path, that we should
absolutely not delete "m" but leave it as is.

Thanks,
Jake

> I think we can agree that this is not the correct behavior.
>
>> But this time, I am trying to see if the 

RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-12 Thread Kevin Willford
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Monday, September 11, 2017 9:57 PM
> 
> Let's imagine a path P that is outside the sparse checkout area.
> And we check out a commit that has P.  P would still be recorded in
> the index but it would not materialize in the working tree.  "git
> status" and friends are asked not to treat this as "locally removed",
> to prevent "commit -a" from recording a removal, of course.
> 
> Now, let's further imagine that you have a checkout of the same
> project but at a commit that does not have P.  Then you reset to
> another commit that does have P.  My understanding of what Kevin's
> first test wants to demonstrate is that the index is populated with
> P (because you did reset to a commit with that path) but it does not
> materialize in the working tree (perhaps because that is outside the
> sparse checkout area?), yet there is something missing compared to
> the earlier case where "git status" and friends are asked not to
> treat P as "locally removed".  They instead show P as locally removed,
> and "commit -a" would record a removal---that is indeed a problem.
> 
> Am I reading the problem description correctly so far?  If so, then
> my answer to my first question (are we solving a right problem?) is
> yes.
> 

I think this is where I need to do a better job of explaining so here is a
simple example.

I have a file "a" that was added in the latest commit.  
$ git log --oneline
c1fa646 (HEAD -> reset, master) add file a
40b342c Initial commit with file i

Running the reset without using a sparse-checkout file

$ git reset HEAD~1
$ git status
On branch reset
Untracked files:
  (use "git add ..." to include in what will be committed)

a

nothing added to commit but untracked files present (use "git add" to track)

Turning on sparse-checkout and running checkout to make my working
directory sparse

$ git config core.sparsecheckout true
$ echo /i > .git/info/sparse-checkout
$ git checkout -f

Running reset gives me 
$ git reset HEAD~1
$ git status
On branch reset
nothing to commit, working tree clean
$ git ls-files
i

file a is gone.  Not in the index and not in the working directory.
Nothing to let the user know that anything changed.

With a modified file no sparse-checkout
$ git log --oneline
6fbd34a (HEAD -> reset, modified) modified file m
c734d72 Initial commit with file i and m
$ git reset HEAD~1
Unstaged changes after reset:
M   m
$ git status
On branch reset
Changes not staged for commit:
  (use "git add ..." to update what will be committed)
  (use "git checkout -- ..." to discard changes in working directory)

modified:   m

no changes added to commit (use "git add" and/or "git commit -a")

With sparse-checkout
$ git reset HEAD~1
Unstaged changes after reset:
D   m
$ git status
On branch reset
Changes not staged for commit:
  (use "git add/rm ..." to update what will be committed)
  (use "git checkout -- ..." to discard changes in working directory)

deleted:m

no changes added to commit (use "git add" and/or "git commit -a")

I think we can agree that this is not the correct behavior.

> But this time, I am trying to see if the approach is good.  I am not
> so sure if the approach taken by this patch is so obviously good as
> you seem to think.  A logical consequence of the approach "git
> status thinks that P appears in the index and missing in the working
> tree is a local removal, so let's squelch it by creating the file in
> the working tree" is that we will end up fully populating the
> working tree with paths that are clearly outside the area the user
> designated as the sparse checkout area---surely that may squelch
> "status", but to me, it looks like it is breaking what the user
> wanted to achieve with "sparse checkout" in the first place.
> 

I don't think that we are trying to "squelch" status so much as make
it consistent with what the user would expect to happen.  If that means
not resetting entries with the skip-worktree bit or resetting the entries
but keeping the skip-worktree bit on, okay, but I would argue that is
not what the user wants because if you are now saying that sparse
means git will not change files outside the sparse-checkout entries,
what about merge, rebase, cherry-pick, apply?  Should those only
change the files that are in the sparse definition?  If so we would
be changing the commits from the original, i.e.  cherry-pick 123 would
create a different commit depending on whether or not you are using
sparse as well as a different commit depending on what is in your
sparse-checkout.  

I see reset being a similar scenario in that if everything is clean, after I
"reset HEAD~1" I should be able to run "add ." + "commit" and have
the same commit as before the reset.  If this is changed to only reset
the sparse entries, there will be staged changes after the reset because
HEAD has changed but we didn't update the index versions of the files.
If we do update the index with the "HEAD~1" 

Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-11 Thread Junio C Hamano
Johannes Schindelin  writes:

>> Stepping back a bit, I am not sure if it is sane or even valid for the
>> end-user to modify paths outside sparse-checkout area, but that is
>> probably a separate tangent.
>
> That is not at all the scenario that Kevin fixed.

I know, but in this tangent I was reacting to this part of Kevin's
message

>>  create $path
>>  git add $path
>>  git commit
>>  git checkout // where $path is not in the sparse-checkout
>>  git reset HEAD~1

to which I was responding to.

> ...
> Yet without Kevin's fix, `git status` would report that the user *deleted
> files outside the sparse checkout*.
> ...
> That is such an obvious bug, and Kevin's fix is such an obvious
> improvement over the current upstream Git version, that I would think the
> only thing worth discussing is whether the patch goes about it in a way of
> which you approve.

When reviewing any topic, I'd ask three (or four) questions to
myself:

 * Are we solving a right problem?  Is the problem addressed valid?
 * Are we solving it with a right approach?
 * Does the patch implement the approach correctly?

(the fourth is s/correct/efficient/ of the third, which is optional).

and any "no" to an earlier question will make it a moot point to ask
further questions.

Let's imagine a path P that is outside the sparse checkout area.
And we check out a commit that has P.  P would still be recorded in
the index but it would not materialize in the working tree.  "git
status" and friends are asked not to treat this as "locally removed",
to prevent "commit -a" from recording a removal, of course.

Now, let's further imagine that you have a checkout of the same
project but at a commit that does not have P.  Then you reset to
another commit that does have P.  My understanding of what Kevin's
first test wants to demonstrate is that the index is populated with
P (because you did reset to a commit with that path) but it does not
materialize in the working tree (perhaps because that is outside the
sparse checkout area?), yet there is something missing compared to
the earlier case where "git status" and friends are asked not to
treat P as "locally removed".  They instead show P as locally removed,
and "commit -a" would record a removal---that is indeed a problem.

Am I reading the problem description correctly so far?  If so, then
my answer to my first question (are we solving a right problem?) is
yes.

My response to an earlier version of the patch was written while
assuming (read: without thinking deeply but trusting that the
contributor thought things through) that the first two answers were
"yes".  If the right approach were to check out what is in the index
to silence "git status" and friends by matching the index and the
working tree, then the implementation in the patch (i.e. setting up
some cache entry and calling checkout_entry() to make it materialize
in the working tree) looked correct, and that is what I meant by
"Other than that, the patch looks quite cleanly done and well
explained."

But this time, I am trying to see if the approach is good.  I am not
so sure if the approach taken by this patch is so obviously good as
you seem to think.  A logical consequence of the approach "git
status thinks that P appears in the index and missing in the working
tree is a local removal, so let's squelch it by creating the file in
the working tree" is that we will end up fully populating the
working tree with paths that are clearly outside the area the user
designated as the sparse checkout area---surely that may squelch
"status", but to me, it looks like it is breaking what the user
wanted to achieve with "sparse checkout" in the first place.

When we make a sparse checkout that places P outside the checked-out
area, with P in the index and not in the working tree, what asks
"git status" and friends not to treat P as "locally removed"?
Shouldn't "reset HEAD~1" that resurrected P that was missing in the
commit in the state before you did the "reset" be doing the same
(e.g. flipping the bit in the index for path P that says "not having
this in the working tree is not a removal---it is deliberately not
checked out")?  If that is the right approach to solve the issue
(which we established is a right problem to solve already), and the
approach that the patch wants to take is quite the opposite of it,
then my answer to the second question (are we solving the problem
with the right approach?) is no.  And at that point, it is moot to
ask if the code correctly and/or efficiently implements that wrong
solution, isn't it?


Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-11 Thread Johannes Schindelin
Hi Junio,

On Mon, 11 Sep 2017, Junio C Hamano wrote:

> Stepping back a bit, I am not sure if it is sane or even valid for the
> end-user to modify paths outside sparse-checkout area, but that is
> probably a separate tangent.

That is not at all the scenario that Kevin fixed. Just have a quick look
at the regression test: in a sparse checkout, the user checked out a
branch, then called `reset` to switch to a different commit. No file was
touched by the user outside the sparse checkout.

Yet without Kevin's fix, `git status` would report that the user *deleted
files outside the sparse checkout*.

That is such an obvious bug, and Kevin's fix is such an obvious
improvement over the current upstream Git version, that I would think the
only thing worth discussing is whether the patch goes about it in a way of
which you approve.

For example, you mentioned that you would want to move the declaration of
`two` and `was_missing` into the conditional code block. That is a valid
suggestion for `was_missing` (but of course not for `two`, which is used
in the condition of the code block). That suggestion is more about code
style (and of course easily fixed by Kevin using Edit>Refactor>Move
Definition Location in VS), though, than about the correctness of the post
image.

Much more interesting would be a review of the conditional code block. And
I am not talking about the camelCasing of `ceBefore` (which will be fixed
as easily by Edit>Refactor>Rename). I am talking about the stuff where
tools cannot help, but where your experience is necessary: is it correct
to use make_cache_entry()/checkout_entry() in this case? Are the
parameters passed to those functions correct? Is the call to
cache_name_pos() followed by ce_skip_worktree() the best way to find out
whether the file that is absent was not actually deleted by the user, or
is there a less CPU-intensive way, seeing as we are already guaranteed to
iterate over the queue diff in alphabetical order?

I understand that those latter questions are a lot harder to answer, sorry
about that.

Ciao,
Dscho


Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-10 Thread Junio C Hamano
Kevin Willford  writes:

> I agree with this when you are not dealing with a sparse-checkout.
> When using a sparse-checkout I expect git not to touch things
> outside of what I have specified in my sparse-checkout file.  If it
> does, it should let me know or put my working directory in a
> state that is expected.  Especially when it is changing the
> skip-worktree bits causing files outside the sparse-checkout to be
> reported incorrectly by status.

Well, whether using a sparse-checkout or not, I would expect Git not
to touch *any* filesystem entity when "git reset" (not "--hard",
just "git reset []") is given, whether the path is inside
or outside the sparse-checkout area.

Stepping back a bit, I am not sure if it is sane or even valid for
the end-user to modify paths outside sparse-checkout area, but that
is probably a separate tangent.




RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Kevin Willford
> From: Junio C Hamano [mailto:gits...@pobox.com]
> Sent: Friday, September 8, 2017 9:18 PM
> 
> Kevin Willford  writes:
> 
> > 1. reset mixed when there were files that were added
> >
> > In this case the index will no longer have the entry at all because
> > the reset is making the index look like before the file was added
> > which didn't have it. When not using the sparse-checkout this is fine
> > because the file is in the working directory and the reset didn't touch
> > it.  But using the sparse-checkout on that file and if the file was not
> > in the working directory, the index gets reset and the entry for that
> > file is gone and if we don't put the index version of the file before the
> > reset into the working directory, then we have lost the content for
> > that file
> 
> I do not quite understand this argument.  If you do
> 
>   edit $path
>   git add $path
>   rm $path
>   git reset
> 
> for a $path that is not involved in the sparse thing, the version
> that was previously indexed will be lost, but that is fine---the
> user said that version is expendable by saying "reset".
> 
> How would that be different when the $path were not to be
> materialized in the working tree due to sparseness?  Where did that
> "blob" object in the index immediately before you called "reset"
> came from, and why do you say that the user does *not* consider that
> one expendable, unlike the case for non-sparse path example above?
> 

I guess that I should have said files that were newly added, meaning they
are new files that were created and added in the previous commit.
 
I think that the difference is that the user explicitly removed the file. When
using sparse it is git that is causing the removal of the file.  For example
if I have /file in my spare-checkout file so that I am only working on the one
file.  The previous commit had new file2 added and I run a git reset HEAD~1.
I as the user do not expect that file2 just disappear, but yet that is what
happens.  So from you example above if I do.

create $path
git add $path
git commit
git checkout // where $path is not in the sparse-checkout
git reset HEAD~1

$path will be gone yet I the user did not remove it.  I guess you could
argue that the user did when they specified their sparse-checkout and
ran checkout but I wouldn't know what would go missing unless I ran
a diff before the reset to see.  There could have been X number of
files created and added in the previous commit(s) and status after the
reset would not report them and they are gone.  So I could clone,
setup sparse-checkout, checkout, reset HEAD~X and possibly lose
data I didn't expect to.

In the modified case where the previous commits have modifications
to files outside the sparse-checkout at least the status after the reset reports
the file as deleted so the user sees that something has happened to it.

I suppose the entry could stay in the index with the skip-worktree bit
on and not removed like it is now so that a git reset will only apply
to the entries in the sparse-checkout?  That seems like it would be
changing the meaning of reset.

> I suspect that a similar reasoning would apply to your 2., but I
> didn't think it through.
> 
> The possible misconception, which I perceive in both of these, is
> that you are somehow disagreeing with this basic assumption: by
> saying "git reset []", the user is telling us that the
> version in the index, even if that is different from HEAD,
> , or the file in the working tree, is *unwanted* and be
> replaced with the one in HEAD (or  when given).  Touching
> the working tree files upon "git reset" is the last thing the user
> expects to happen.
> 

I agree with this when you are not dealing with a sparse-checkout.
When using a sparse-checkout I expect git not to touch things
outside of what I have specified in my sparse-checkout file.  If it
does, it should let me know or put my working directory in a
state that is expected.  Especially when it is changing the
skip-worktree bits causing files outside the sparse-checkout to be
reported incorrectly by status.

 



Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Junio C Hamano
Kevin Willford  writes:

> 1. reset mixed when there were files that were added
>
> In this case the index will no longer have the entry at all because
> the reset is making the index look like before the file was added
> which didn't have it. When not using the sparse-checkout this is fine
> because the file is in the working directory and the reset didn't touch
> it.  But using the sparse-checkout on that file and if the file was not
> in the working directory, the index gets reset and the entry for that
> file is gone and if we don't put the index version of the file before the
> reset into the working directory, then we have lost the content for
> that file

I do not quite understand this argument.  If you do

edit $path
git add $path
rm $path
git reset

for a $path that is not involved in the sparse thing, the version
that was previously indexed will be lost, but that is fine---the
user said that version is expendable by saying "reset".

How would that be different when the $path were not to be
materialized in the working tree due to sparseness?  Where did that
"blob" object in the index immediately before you called "reset"
came from, and why do you say that the user does *not* consider that
one expendable, unlike the case for non-sparse path example above?

I suspect that a similar reasoning would apply to your 2., but I
didn't think it through.  

The possible misconception, which I perceive in both of these, is
that you are somehow disagreeing with this basic assumption: by
saying "git reset []", the user is telling us that the
version in the index, even if that is different from HEAD,
, or the file in the working tree, is *unwanted* and be
replaced with the one in HEAD (or  when given).  Touching
the working tree files upon "git reset" is the last thing the user
expects to happen.




RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Kevin Willford
From: Junio C Hamano [mailto:gits...@pobox.com]
Sent: Friday, September 8, 2017 1:02 PM
> Kevin Willford  writes:
> 
> > diff --git a/builtin/reset.c b/builtin/reset.c
> > index d72c7d1c96..1b8bb45989 100644
> > --- a/builtin/reset.c
> > +++ b/builtin/reset.c
> > @@ -24,6 +24,7 @@
> >  #include "cache-tree.h"
> >  #include "submodule.h"
> >  #include "submodule-config.h"
> > +#include "dir.h"
> >
> >  static const char * const git_reset_usage[] = {
> > N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q]
> []"),
> > @@ -124,8 +125,32 @@ static void update_index_from_diff(struct
> diff_queue_struct *q,
> >
> > for (i = 0; i < q->nr; i++) {
> > struct diff_filespec *one = q->queue[i]->one;
> > +   struct diff_filespec *two = q->queue[i]->two;
> > +   int pos;
> > int is_missing = !(one->mode && !is_null_oid(>oid));
> > +   int was_missing = !two->mode && is_null_oid(>oid);
> > struct cache_entry *ce;
> > +   struct cache_entry *ceBefore;
> > +   struct checkout state = CHECKOUT_INIT;
> 
> The five new variables are only used in the new block, so it
> probably is better to limit their scope to the "we do something
> unusual when sparse checkout is in effect" block as well.  The scope
> for the latter two can further be narrowed down to "we do need to
> force a checkout of this entry" block.
> 
> We prefer to name our variables with underscore (e.g. ce_before)
> over camelCase (e.g. ceBefore) unless there is a compelling reason
> (e.g. a platform specific code in compat/ layer to match platform
> convention).
> 

Will update.

> > +
> > +   if (core_apply_sparse_checkout && !file_exists(two->path)) {
> > +   pos = cache_name_pos(two->path, strlen(two->path));
> > +   if ((pos >= 0 && ce_skip_worktree(active_cache[pos]))
> &&
> > +   (is_missing || !was_missing))
> > +   {
> > +   state.force = 1;
> > +   state.refresh_cache = 1;
> > +   state.istate = _index;
> > +   ceBefore = make_cache_entry(two->mode,
> > +   two->oid.hash,
> > +   two->path, 0, 0);
> > +   if (!ceBefore)
> > +   die(_("make_cache_entry failed for path
> '%s'"),
> > +   two->path);
> > +
> > +   checkout_entry(ceBefore, , NULL);
> > +   }
> > +   }
> 
> Can we tell between the case where the reason why the path was not
> there in the working tree was due to the path being excluded by the
> sparse checkout and the path being removed manually by the end user?
> 
> I guess ce_skip_worktree() check is sufficient; we force checkout only
> when the path is marked to be skipped due to "sparse" thing.
> 
> Do we have to worry about the reverse case, in which file_exists(two->path)
> is true (i.e. the user created a file there manually) even though
> the path is marked to be skipped due to "sparse" setting?
> 

I don't believe so because if the user has a file there whether they modified
it or not, it is what the user did and we just leave it there and a diff with
what the index gets reset to will show how the file is different from what the
index got reset to.

> Other than that, the patch looks quite cleanly done and well explained.
> 
> Thanks.
> 


RE: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Kevin Willford

From: Junio C Hamano [mailto:gits...@pobox.com]
Sent: Friday, September 8, 2017 1:12 PM
> Kevin Willford  writes:
> 
> > When using the sparse checkout feature the git reset command will add
> > entries to the index that will have the skip-worktree bit off but will
> > leave the working directory empty.  File data is lost because the index
> > version of the files have been changed but there is nothing that is in the
> > working directory.  This will cause the next status call to show either
> > deleted for files modified or deleting or nothing for files added.
> 
> Getting rid of sparseness may of course be one way to correct the
> discrepancy, but it is unclear to me if that is the best way to do
> so.  An alternative may be to add entries to the index that does
> have the bit on for paths that are meant to be skipped due to
> "sparse" thing, no?  Am I being naive, missing some reason why that
> would give us a worse result?

There are two cases that this is trying to solve.

1. reset mixed when there were files that were added

In this case the index will no longer have the entry at all because
the reset is making the index look like before the file was added
which didn't have it. When not using the sparse-checkout this is fine
because the file is in the working directory and the reset didn't touch
it.  But using the sparse-checkout on that file and if the file was not
in the working directory, the index gets reset and the entry for that
file is gone and if we don't put the index version of the file before the
reset into the working directory, then we have lost the content for
that file

2. reset mixed when there were files modified

This case is similar but with modified files there is an entry in the index
but it is getting changed to a previous version of the file.  If we don't get
the file on disk then the version of the file that, in the non sparse-checkout
case, would be on disk is lost and cannot be recovered.  So even if we turn
on the skip-worktree bit for this entry and it doesn't show up in status
calls, we lost the previous version of the file.




Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Junio C Hamano
Kevin Willford  writes:

> When using the sparse checkout feature the git reset command will add
> entries to the index that will have the skip-worktree bit off but will
> leave the working directory empty.  File data is lost because the index
> version of the files have been changed but there is nothing that is in the
> working directory.  This will cause the next status call to show either
> deleted for files modified or deleting or nothing for files added.  

Getting rid of sparseness may of course be one way to correct the
discrepancy, but it is unclear to me if that is the best way to do
so.  An alternative may be to add entries to the index that does
have the bit on for paths that are meant to be skipped due to
"sparse" thing, no?  Am I being naive, missing some reason why that
would give us a worse result?


Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> +test_expect_success 'setup' '
>> +test_tick &&
>
> Do we need a test_tick here ?

As the test does not check against exact object names, and it does
not create commits, the order among which needs to be tiebroken by
using the committer timestamp, it is not strictly necessary, but I
do not think it would hurt, either.

>
>> +echo "checkout file" >c &&
>> +echo "modify file" >m &&
>> +echo "delete file" >d &&
>> +git add . &&
>> +git commit -m "initial commit" &&
>> +echo "added file" >a &&
>> +echo "modification of a file" >m &&
>> +git rm d &&
>> +git add . &&
>> +git commit -m "second commit" &&
>> +git checkout -b endCommit
>> +'
>> +
>> +test_expect_success 'reset when there is a sparse-checkout' '
>> +echo "/c" >.git/info/sparse-checkout &&
>> +test_config core.sparsecheckout true &&
>> +git checkout -b resetBranch &&
>> +test_path_is_missing m &&
>> +test_path_is_missing a &&
>> +test_path_is_missing d &&
>> +git reset HEAD~1 &&
>> +test "checkout file" = "$(cat c)" &&
>> +test "modification of a file" = "$(cat m)" &&
>> +test "added file" = "$(cat a)" &&
>> +test_path_is_missing d
>> +'
>> +


Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Junio C Hamano
Kevin Willford  writes:

> diff --git a/builtin/reset.c b/builtin/reset.c
> index d72c7d1c96..1b8bb45989 100644
> --- a/builtin/reset.c
> +++ b/builtin/reset.c
> @@ -24,6 +24,7 @@
>  #include "cache-tree.h"
>  #include "submodule.h"
>  #include "submodule-config.h"
> +#include "dir.h"
>  
>  static const char * const git_reset_usage[] = {
>   N_("git reset [--mixed | --soft | --hard | --merge | --keep] [-q] 
> []"),
> @@ -124,8 +125,32 @@ static void update_index_from_diff(struct 
> diff_queue_struct *q,
>  
>   for (i = 0; i < q->nr; i++) {
>   struct diff_filespec *one = q->queue[i]->one;
> + struct diff_filespec *two = q->queue[i]->two;
> + int pos;
>   int is_missing = !(one->mode && !is_null_oid(>oid));
> + int was_missing = !two->mode && is_null_oid(>oid);
>   struct cache_entry *ce;
> + struct cache_entry *ceBefore;
> + struct checkout state = CHECKOUT_INIT;

The five new variables are only used in the new block, so it
probably is better to limit their scope to the "we do something
unusual when sparse checkout is in effect" block as well.  The scope
for the latter two can further be narrowed down to "we do need to
force a checkout of this entry" block.

We prefer to name our variables with underscore (e.g. ce_before)
over camelCase (e.g. ceBefore) unless there is a compelling reason
(e.g. a platform specific code in compat/ layer to match platform
convention).

> +
> + if (core_apply_sparse_checkout && !file_exists(two->path)) {
> + pos = cache_name_pos(two->path, strlen(two->path));
> + if ((pos >= 0 && ce_skip_worktree(active_cache[pos])) &&
> + (is_missing || !was_missing))
> + {
> + state.force = 1;
> + state.refresh_cache = 1;
> + state.istate = _index;
> + ceBefore = make_cache_entry(two->mode,
> + two->oid.hash,
> + two->path, 0, 0);
> + if (!ceBefore)
> + die(_("make_cache_entry failed for path 
> '%s'"),
> + two->path);
> +
> + checkout_entry(ceBefore, , NULL);
> + }
> + }

Can we tell between the case where the reason why the path was not
there in the working tree was due to the path being excluded by the
sparse checkout and the path being removed manually by the end user?

I guess ce_skip_worktree() check is sufficient; we force checkout only
when the path is marked to be skipped due to "sparse" thing.

Do we have to worry about the reverse case, in which file_exists(two->path)
is true (i.e. the user created a file there manually) even though
the path is marked to be skipped due to "sparse" setting?

Other than that, the patch looks quite cleanly done and well explained.

Thanks.

> diff --git a/t/t7114-reset-sparse-checkout.sh 
> b/t/t7114-reset-sparse-checkout.sh
> new file mode 100755
> index 00..f2a5426847
> --- /dev/null
> +++ b/t/t7114-reset-sparse-checkout.sh
> @@ -0,0 +1,60 @@
> +#!/bin/sh
> +
> +test_description='reset when using a sparse-checkout'
> +
> +. ./test-lib.sh
> +
> +# reset using a sparse-checkout file
> +
> +test_expect_success 'setup' '
> + test_tick &&
> + echo "checkout file" >c &&
> + echo "modify file" >m &&
> + echo "delete file" >d &&
> + git add . &&
> + git commit -m "initial commit" &&
> + echo "added file" >a &&
> + echo "modification of a file" >m &&
> + git rm d &&
> + git add . &&
> + git commit -m "second commit" &&
> + git checkout -b endCommit
> +'
> +
> +test_expect_success 'reset when there is a sparse-checkout' '
> + echo "/c" >.git/info/sparse-checkout &&
> + test_config core.sparsecheckout true &&
> + git checkout -b resetBranch &&
> + test_path_is_missing m &&
> + test_path_is_missing a &&
> + test_path_is_missing d &&
> + git reset HEAD~1 &&
> + test "checkout file" = "$(cat c)" &&
> + test "modification of a file" = "$(cat m)" &&
> + test "added file" = "$(cat a)" &&
> + test_path_is_missing d
> +'
> +
> +test_expect_success 'reset after deleting file without skip-worktree bit' '
> + git checkout -f endCommit &&
> + git clean -xdf &&
> + cat >.git/info/sparse-checkout <<-\EOF &&
> + /c
> + /m
> + EOF
> + test_config core.sparsecheckout true &&
> + git checkout -b resetAfterDelete &&
> + test_path_is_file m &&
> + test_path_is_missing a &&
> + test_path_is_missing d &&
> + rm -f m &&
> + git reset HEAD~1 &&
> + test "checkout file" = "$(cat c)" &&
> + test "added file" = "$(cat a)" 

Re: [PATCH 1/1] reset: fix reset when using the sparse-checkout feature.

2017-09-08 Thread Torsten Bögershausen
On Fri, Sep 08, 2017 at 12:00:50PM -0600, Kevin Willford wrote:
> When using the sparse checkout feature the git reset command will add
> entries to the index that will have the skip-worktree bit off but will
> leave the working directory empty.  File data is lost because the index
> version of the files have been changed but there is nothing that is in the
> working directory.  This will cause the next status call to show either
> deleted for files modified or deleting or nothing for files added.  The
> added files should be shown as untracked and modified files should be
> shown as modified.
> 
> To fix this when the reset is running if there is not a file in the working
> directory and if it will be missing with the new index entry or was not
> missing in the previous version, we create the previous index version of
> the file in the working directory so that status will report correctly
> and the files will be availble for the user to deal with.
> 
> Signed-off-by: Kevin Willford 
[]
> diff --git a/t/t7114-reset-sparse-checkout.sh 
> b/t/t7114-reset-sparse-checkout.sh
> new file mode 100755
> index 00..f2a5426847
> --- /dev/null
> +++ b/t/t7114-reset-sparse-checkout.sh
> @@ -0,0 +1,60 @@
> +#!/bin/sh
> +
> +test_description='reset when using a sparse-checkout'
> +
> +. ./test-lib.sh
> +
> +# reset using a sparse-checkout file
> +
> +test_expect_success 'setup' '
> + test_tick &&

Do we need a test_tick here ?

> + echo "checkout file" >c &&
> + echo "modify file" >m &&
> + echo "delete file" >d &&
> + git add . &&
> + git commit -m "initial commit" &&
> + echo "added file" >a &&
> + echo "modification of a file" >m &&
> + git rm d &&
> + git add . &&
> + git commit -m "second commit" &&
> + git checkout -b endCommit
> +'
> +
> +test_expect_success 'reset when there is a sparse-checkout' '
> + echo "/c" >.git/info/sparse-checkout &&
> + test_config core.sparsecheckout true &&
> + git checkout -b resetBranch &&
> + test_path_is_missing m &&
> + test_path_is_missing a &&
> + test_path_is_missing d &&
> + git reset HEAD~1 &&
> + test "checkout file" = "$(cat c)" &&
> + test "modification of a file" = "$(cat m)" &&
> + test "added file" = "$(cat a)" &&
> + test_path_is_missing d
> +'
> +