Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Elijah Newren
On Mon, Apr 23, 2018 at 4:46 PM, Junio C Hamano  wrote:
> Elijah Newren  writes:
>
>> Out of 53288 merge commits with exactly two parents in linux.git:
>>   - 48491 merged identically
>>   - 4737 merged the same other than a few different "Auto-merging
>> " output lines (as expected due to patch 35/36)
>>   - 53 merged the same other than different "Checking out files: ..."
>> output (I just did a plain merge; no flags like --no-progress)
>>   - the remaining 7 commits had non-trivial merge differences, all
>> attributable to directory rename detection kicking in
>>
>> So, it looks good to me.  If anyone has suggestions for other testing
>> to do, let me know.
>
> There must have been some merges that stopped due to conflicts among
> those 50k, and I am interested to hear how they were different.  Or
> are they included in the above numbers (e.g. among 48491 there were
> ones that stopped with conflicts, but the results these conflictted
> merge left in the working tree and the index were identical)?

They are included in the categories listed above.  What my comparison
did was for each of the 53288 commits:

1) Do the merge, capture stdout and stderr, and the exit status
2) Record output of 'git ls-files -s'
3) Record output of 'git status | grep -v detached'
4) Record contents of every untracked file (could be created e.g. due
to D/F conflicts)
5) Record contents of 'git diff -M --staged'
6) Record contents of 'git diff -M'
(all of this stuff in 1-6 is recorded into a single text file with
some nice headers to split the sections up).

7) Repeat steps 1-6 with the new version of git, but recording into a
different filename
8) Compare the two text files to see what was different between the
two merges, if anything.
(If they are different, save the files somewhere for me to look at later.)


Then after each merge, there's a bunch of cleanup to make sure things
are in a pristine state for the next merge.


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Junio C Hamano
Elijah Newren  writes:

> Out of 53288 merge commits with exactly two parents in linux.git:
>   - 48491 merged identically
>   - 4737 merged the same other than a few different "Auto-merging
> " output lines (as expected due to patch 35/36)
>   - 53 merged the same other than different "Checking out files: ..."
> output (I just did a plain merge; no flags like --no-progress)
>   - the remaining 7 commits had non-trivial merge differences, all
> attributable to directory rename detection kicking in
>
> So, it looks good to me.  If anyone has suggestions for other testing
> to do, let me know.

There must have been some merges that stopped due to conflicts among
those 50k, and I am interested to hear how they were different.  Or
are they included in the above numbers (e.g. among 48491 there were
ones that stopped with conflicts, but the results these conflictted
merge left in the working tree and the index were identical)?


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Elijah Newren
Hi Junio,

On Thu, Apr 19, 2018 at 8:05 PM, Junio C Hamano  wrote:
>> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:
>>> This series is a reboot of the directory rename detection series that was
>>> merged to master and then reverted due to the final patch having a buggy
>>> can-skip-update check, as noted at
>>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>>> This series based on top of master.
>
> The series as-is is fine, I think, from the maintainer's point of
> view.  Thanks.

Sorry to be a pest, but now I'm unsure how I should handle the next
round.  I've got:
- two minor fixup commits that can be trivially squashed in (not yet
sent), affecting just the final few patches
- a "year" vs "years" typo in commit message of patch 32 (which is now
in pu as commit 3daa9b3eb6dd)
- an (independent-ish) unpack_trees fix (Message-ID:
20180421193736.12722-1-new...@gmail.com), possibly to be supplemented
by another fix/improvement suggested by Duy

Should I...
- send out a reroll of everything, and include the unpack_trees
fix(es) in the series?
- just resend patches 32-36 with the fixes, and renumber the patches
to include the unpack_trees stuff in the middle?
- just send the two fixup commits, ignore the minor typo, and keep the
unpack_trees fix(es) as a separate topic that we'll just want to
advance first?
- something else?

Thanks,
Elijah


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-23 Thread Elijah Newren
On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:
> Additional testing:
>
>   * I've re-merged all ~13k merge commits in git.git with both
> git-2.17.0 and this version of git, comparing the results to each
> other in detail.  (Including stdout & stderr, as well as the output
> of subsequent commands like `git status`, `git ls-files -s`, `git
> diff -M`, `git diff -M --staged`).  The only differences were in 23
> merges of either git-gui or gitk which involved directory renames
> (e.g. git-2.17.0's merge would result in files like 'lib/tools.tcl'
> or 'po/ru.po' instead of the expected 'git-gui/lib/tools.tcl' or
> 'gitk-git/po/ru.po')
>
>   * I'm trying to do the same with linux.git, but it looks like that will
> take nearly a week to complete...

Results after restarting[1] and throwing some big hardware at it to
get faster completion:

Out of 53288 merge commits with exactly two parents in linux.git:
  - 48491 merged identically
  - 4737 merged the same other than a few different "Auto-merging
" output lines (as expected due to patch 35/36)
  - 53 merged the same other than different "Checking out files: ..."
output (I just did a plain merge; no flags like --no-progress)
  - the remaining 7 commits had non-trivial merge differences, all
attributable to directory rename detection kicking in

So, it looks good to me.  If anyone has suggestions for other testing
to do, let me know.

[1] Restarted so it could include my unpack_trees fix (from
message-id20180421193736.12722-1-new...@gmail.com) plus a couple minor
fixup commits (fixing some testcase nits and a comment typo).


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Junio C Hamano
Elijah Newren  writes:

> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

The series as-is is fine, I think, from the maintainer's point of
view.  Thanks.


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Elijah Newren
On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren  wrote:
> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

Sorry, user error; there are no conflicts with my series.

(I accidentally included Junio's interim round of my own series and
while trying to spot problems I saw commits from these other series
touching relevant files in what looked like nearby areas.  Directly
merging with these other two series or even merging all of pu before
en/rename-directory-detection-reboot followed by individually merging
later series has no conflicts with any of my changes.)


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Derrick Stolee

On 4/19/2018 2:41 PM, Stefan Beller wrote:

On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren  wrote:

On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:

This series is a reboot of the directory rename detection series that was
merged to master and then reverted due to the final patch having a buggy
can-skip-update check, as noted at
   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
This series based on top of master.

...and merges cleanly to next but apparently has some minor conflicts
with both ds/lazy-load-trees and ps/test-chmtime-get from pu.

What's the preferred way to resolve this?  Rebase and resubmit my
series on pu, or something else?

If you were to base it off of pu, this series would depend on all other
series that pu contains. This is bad for the progress of this series.
(If it were to be merged to next, all other series would automatically
merge to next as well)

If the conflicts are minor, then Junio resolves them; if you want to be
nice, pick your merge point as

 git checkout origin/master
 git merge ds/lazy-load-trees
 git merge ps/test-chmtime-get
 git tag my-anchor

and put the series on top of that anchor.

If you do this, you'd want to be reasonably sure that
those two series are not in too much flux.


I believe ds/lazy-load-trees is queued for 'next'. I'm not surprised 
that there are some conflicts here. Any reference to the 'tree' member 
of a commit should be replaced with 'get_commit_tree(c)', or 
'get_commit_tree_oid(c)' if you only care about the tree's object id.


I think Stefan's suggestion is the best approach to get the right 
conflicts out of the way.


Thanks,
-Stolee


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Stefan Beller
On Thu, Apr 19, 2018 at 11:35 AM, Elijah Newren  wrote:
> On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:
>> This series is a reboot of the directory rename detection series that was
>> merged to master and then reverted due to the final patch having a buggy
>> can-skip-update check, as noted at
>>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
>> This series based on top of master.
>
> ...and merges cleanly to next but apparently has some minor conflicts
> with both ds/lazy-load-trees and ps/test-chmtime-get from pu.
>
> What's the preferred way to resolve this?  Rebase and resubmit my
> series on pu, or something else?

If you were to base it off of pu, this series would depend on all other
series that pu contains. This is bad for the progress of this series.
(If it were to be merged to next, all other series would automatically
merge to next as well)

If the conflicts are minor, then Junio resolves them; if you want to be
nice, pick your merge point as

git checkout origin/master
git merge ds/lazy-load-trees
git merge ps/test-chmtime-get
git tag my-anchor

and put the series on top of that anchor.

If you do this, you'd want to be reasonably sure that
those two series are not in too much flux.

Thanks,
Stefan


Re: [PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Elijah Newren
On Thu, Apr 19, 2018 at 10:57 AM, Elijah Newren  wrote:
> This series is a reboot of the directory rename detection series that was
> merged to master and then reverted due to the final patch having a buggy
> can-skip-update check, as noted at
>   https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
> This series based on top of master.

...and merges cleanly to next but apparently has some minor conflicts
with both ds/lazy-load-trees and ps/test-chmtime-get from pu.

What's the preferred way to resolve this?  Rebase and resubmit my
series on pu, or something else?


[PATCH v10 00/36] Add directory rename detection to git

2018-04-19 Thread Elijah Newren
This series is a reboot of the directory rename detection series that was
merged to master and then reverted due to the final patch having a buggy
can-skip-update check, as noted at
  https://public-inbox.org/git/xmqqmuya43cs@gitster-ct.c.googlers.com/
This series based on top of master.

This updated series fixes the problem found with the previous series, and
also fixes Linus' issue with unnecessary rebuilds noted at
  
https://public-inbox.org/git/CA+55aFzLZ3UkG5svqZwSnhNk75=fxjrkvu1m_rhbg54nooa...@mail.gmail.com/

For the original details about design considerations surrounding
directory rename detection, see
  https://public-inbox.org/git/20171110190550.27059-1-new...@gmail.com/

Patches 1--28 are identical to what was previously merged to master,
modulo trivial compilation fixes due to the fact that I've rebased on
master which now includes commit 916bc35b29af ("tree-walk: convert tree
entry functions to object_id", 2018-03-12).  As such, I've retained the
Reviewed-by and Signed-off-by tags for these first 28 patches.  (The
final patch of the original series, patch 29, has been rewritten and
replaced in this series.)

The remaining eight patches are new; a brief summary:

  merge-recursive: improve add_cacheinfo error handling
  merge-recursive: move more is_dirty handling to merge_content
  merge-recursive: avoid triggering add_cacheinfo error with dirty mod

When Junio was bit by the previous series, the code reached a
detected error state that should not ever be hit in production.
That was bad enough, but the problem compounded because the code
simply printed a vague not-very-scary-sounding error, and returned
an error code that the caller ignored (which not only proceeded to
then handle other paths which might print messages causing the error
to scroll off the screen, but could result in a "clean" merge).  Fix
issues with the error handling...and then deal with the breakage of
one particular test that was triggering this codepath.

  t6046: testcases checking whether updates can be skipped in a merge

Add a fairly comprehensive set of tests for the skipability of
working tree updates.

  merge-recursive: fix was_tracked() to quit lying with some renamed
paths
  merge-recursive: fix remainder of was_dirty() to use original index

Instead of using the current index as a (rather imperfect) proxy for
the state of the index just before the merge, keep a copy of the
original index around so we can get correct answers to whether
certain paths were tracked or dirty before the merge.

  merge-recursive: make "Auto-merging" comment show for other merges
  merge-recursive: fix check for skipability of working tree updates

Fix and simplify the skipability check.  Due to some tests being
picky about output, the first of these two patches exists to avoid
triggering the "Auto-merging $FILE" message too often with the
simplified logic; in the process, it fixes a pair of existing issues
with when those messages are shown, making it more accurate in
general.

Additional testing:

  * I've re-merged all ~13k merge commits in git.git with both
git-2.17.0 and this version of git, comparing the results to each
other in detail.  (Including stdout & stderr, as well as the output
of subsequent commands like `git status`, `git ls-files -s`, `git
diff -M`, `git diff -M --staged`).  The only differences were in 23
merges of either git-gui or gitk which involved directory renames
(e.g. git-2.17.0's merge would result in files like 'lib/tools.tcl'
or 'po/ru.po' instead of the expected 'git-gui/lib/tools.tcl' or
'gitk-git/po/ru.po')

  * I'm trying to do the same with linux.git, but it looks like that will
take nearly a week to complete...

My biggest question:

  * Is there any other testing others would like to see, in order to avoid
a repeat of the pain from my previous series and allow us to safely
merge this newer one?

Elijah Newren (36):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
far
  directory rename detection: partially renamed directory
testcase/discussion
  directory rename detection: files/directories in the way of some
renames
  directory rename detection: testcases checking which side did the
rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
merges
  directory rename detection: miscellaneous testcases to complete
coverage
  directory rename detection: tests for handling overwriting untracked
files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: fix leaks of allocated