Re: Line ending normalization doesn't work as expected

2018-02-16 Thread Robert Dailey
On Fri, Feb 16, 2018 at 10:34 AM, Torsten Bögershausen  wrote:
> On Thu, Feb 15, 2018 at 09:24:40AM -0600, Robert Dailey wrote:
>> On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
>
> []
>>
>> Sorry to bring this old thread back to life, but I noticed that
>> this causes file modes to be reset to 644 (from 755) on the Windows
>> version of Git. Is there a way to run `git read-tree --empty && git add .`
>> without mucking with file permissions?
>
> No problem with the delay; in the meantime we had the chance to improve Git:
>
>>Git 2.16 Release Notes
>>======================
>>[]
>>* "git add --renormalize ." is a new and safer way to record the fact
>>   that you are correcting the end-of-line convention and other
>>   "convert_to_git()" glitches in the in-repository data.
>
> Could you upgrade to Git 2.16.1 (or higher, just take the latest)
> and try with
> git add --renormalize .
> ?

Thanks for the response. Unfortunately I've deliberately been stuck on
v2.13 because of [a bug in Git for Windows][1] that hasn't yet been
resolved (it's a bug *somewhere*, not sure if it's git related or
not). Curly braces in aliases are being stripped which makes newer
releases unusable for me. I'll try upgrading on a different machine
and see if renormalize works for the case of binary files with file
modes set to 755.

[1]: https://github.com/git-for-windows/git/issues/1220
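
For anyone who, like the poster above, cannot move to a newer Git yet, one possible workaround is to re-add everything and then copy the executable bits back from HEAD. The following is only a sketch under stated assumptions: it was not proposed in this thread, the repository and the file name tool.sh are made up, and `core.fileMode false` is set merely to imitate a filesystem that does not record the executable bit.

```shell
#!/bin/sh
# Sketch of a workaround for Git versions without "git add --renormalize":
# re-add everything, then copy the executable bits back from HEAD.
# Repository contents and names are hypothetical.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.fileMode false   # imitate a filesystem without an executable bit

printf '#!/bin/sh\necho hello\n' > tool.sh
git add tool.sh
git update-index --chmod=+x tool.sh
git commit -q -m "Add executable script"

# The old recipe: the re-added entry falls back to mode 100644.
git read-tree --empty
git add .
mode_after_readd=$(git ls-files -s tool.sh | cut -d' ' -f1)

# Restore the executable bit for every path that HEAD records as 100755.
# Paths with unusual characters would need "git ls-tree -z" handling instead.
git ls-tree -r HEAD | awk '$1 == "100755"' | cut -f2 |
while IFS= read -r path; do
    git update-index --chmod=+x "$path"
done
mode_restored=$(git ls-files -s tool.sh | cut -d' ' -f1)

echo "after re-add: $mode_after_readd, after restore: $mode_restored"
```

The restore step only touches the index, so it should behave the same whether or not the working-tree filesystem can store the bit.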


Re: Line ending normalization doesn't work as expected

2018-02-16 Thread Torsten Bögershausen
On Thu, Feb 15, 2018 at 09:24:40AM -0600, Robert Dailey wrote:
> On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:

[]
> 
> Sorry to bring this old thread back to life, but I noticed that
> this causes file modes to be reset to 644 (from 755) on the Windows
> version of Git. Is there a way to run `git read-tree --empty && git add .`
> without mucking with file permissions?

No problem with the delay; in the meantime we had the chance to improve Git:

>Git 2.16 Release Notes
>======================
>[]
>* "git add --renormalize ." is a new and safer way to record the fact
>   that you are correcting the end-of-line convention and other
>   "convert_to_git()" glitches in the in-repository data.

Could you upgrade to Git 2.16.1 (or higher, just take the latest)
and try with
git add --renormalize .
?
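
The suggestion above could be tried out in a throwaway repository roughly as follows. This is a minimal sketch, assuming Git 2.16 or newer; the file name Program.cs, the identity settings and the commit messages are invented for illustration and do not come from the thread.

```shell
#!/bin/sh
# Sketch: normalize CRLF files that are already committed, using
# "git add --renormalize" (assumes Git 2.16 or newer).
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.autocrlf false

# Commit a file with CRLF line endings before any .gitattributes exists.
printf 'first line\r\nsecond line\r\n' > Program.cs
git add Program.cs
git commit -q -m "Add file with CRLF line endings"

# Turn on end-of-line normalization and re-stage the tracked files.
echo "* text=auto" > .gitattributes
git add --renormalize .
git add .gitattributes

git status --short                      # both paths should show up as staged
eol_info=$(git ls-files --eol Program.cs)
echo "$eol_info"                        # index column should now read i/lf
git commit -q -m "Introduce end-of-line normalization"
```

Git may print a warning that CRLF will be replaced by LF while staging; that is the expected side effect of the conversion rather than an error.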


Re: Line ending normalization doesn't work as expected

2018-02-15 Thread Robert Dailey
On Thu, Feb 15, 2018 at 1:16 PM, Junio C Hamano  wrote:
> I think the message you are referring to is a tangent that discusses
> how it was done in the old world, with issues that come from the
> fact that with such an approach the paths are first removed from the
> index and then added afresh to the index, which can lose cases and
> executable bits when working on a filesystem that does not retain
> enough information.
>
> The way in the new world is to use "add --renormalize" which was
> added at 9472935d ("add: introduce "--renormalize"", 2017-11-16), I
> think.

Oh I didn't realize someone actually did it. If so, that's awesome.
Thanks Junio!


Re: Line ending normalization doesn't work as expected

2018-02-15 Thread Junio C Hamano
Robert Dailey  writes:

> On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
>> Torsten Bögershausen  writes:
>>
>>>> $ git rm -r --cached . && git add .
>>>
>>> (Both should work)
>>>
>>> To be honest, from the documentation, I can't figure out the difference 
>>> between
>>> $ git read-tree --empty
>>> and
>>> $ git rm -r --cached .
>>>
>>> Does anybody remember the discussion, why we ended up with read-tree ?
>> ...
>
> Sorry to bring this old thread back to life, but I noticed that
> this causes file modes to be reset to 644 (from 755) on the Windows
> version of Git. Is there a way to run `git read-tree --empty && git add .`
> without mucking with file permissions?

I think the message you are referring to is a tangent that discusses
how it was done in the old world, with issues that come from the
fact that with such an approach the paths are first removed from the
index and then added afresh to the index, which can lose cases and
executable bits when working on a filesystem that does not retain
enough information.

The way in the new world is to use "add --renormalize" which was
added at 9472935d ("add: introduce "--renormalize"", 2017-11-16), I
think.
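
The difference described above can be observed with a small experiment. This is a sketch, assuming Git 2.16 or newer; `core.fileMode false` stands in for a filesystem that does not retain the executable bit, and the file name run.sh is hypothetical.

```shell
#!/bin/sh
# Sketch: compare how "add --renormalize" and "remove then re-add"
# treat the executable bit when the filesystem cannot be trusted.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.autocrlf false
git config core.fileMode false   # stand-in for a filesystem without exec bits

printf 'line one\r\nline two\r\n' > run.sh
git add run.sh
git update-index --chmod=+x run.sh
git commit -q -m "Add executable file with CRLF line endings"
echo "* text=auto" > .gitattributes

# New way: the entry stays in the index while its content is rehashed.
git add --renormalize .
mode_renormalize=$(git ls-files -s run.sh | cut -d' ' -f1)

# Old way, starting again from the committed state of the index.
git reset -q
git rm -r -q --cached .
git add .
mode_readd=$(git ls-files -s run.sh | cut -d' ' -f1)

echo "renormalize: $mode_renormalize, rm && add: $mode_readd"
```

In the second variant the path is gone from the index when it is added back, so Git has no earlier entry to take the mode from.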



Re: Line ending normalization doesn't work as expected

2018-02-15 Thread Robert Dailey
On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
> Torsten Bögershausen  writes:
>
>>> $ git rm -r --cached . && git add .
>>
>> (Both should work)
>>
>> To be honest, from the documentation, I can't figure out the difference 
>> between
>> $ git read-tree --empty
>> and
>> $ git rm -r --cached .
>>
>> Does anybody remember the discussion, why we ended up with read-tree ?
>
> We used to use neither, and considered it fine to "rm .git/index" if
> you wanted to empty the on-disk index file in the old world.  In the
> modern world, folks want you to avoid touching filesystem directly
> and instead want you to use Git tools, and the above are two obvious
> ways to do so.
>
> "git read-tree" (without any parameter, i.e. "read these 0 trees and
> populate the index with it") and its modern and preferred synonym
> "git read-tree --empty" (i.e. "I am giving 0 trees and I know the
> sole effect of this command is to empty the index.") are more direct
> ways to express "I want the index emptied" between the two.
>
> The other one, "git rm -r --cached .", in the end gives you the same
> state because it tells Git to "iterate over all the entries in the
> index, find the ones that match pathspec '.', and remove them from
> the index.".  It is not wrong per-se, but conceptually it is a bit
> roundabout way to say that "I want the index emptied", I would
> think.
>
> I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
> due to the overhead of having to do the pathspec filtering that ends
> up to be a no-op, but there shouldn't be a difference in the end
> result.

Sorry to bring this old thread back to life, but I noticed that
this causes file modes to be reset to 644 (from 755) on the Windows
version of Git. Is there a way to run `git read-tree --empty && git add .`
without mucking with file permissions?


Re: Line ending normalization doesn't work as expected

2017-10-06 Thread Torsten Bögershausen
On Fri, Oct 06, 2017 at 09:33:31AM +0900, Junio C Hamano wrote:
> Torsten Bögershausen  writes:
> 
> > Before we set this in stone:
> > Does it make sense to say "renormalize" instead of "rehash"?
> > (That term already exists for merge,
> >  and "rehash" is more of a technical term than a user-point-of-view
> >  explanation.)
> 
> I do not mind "renormalize" at all.
> 
> As to the toy patch, I think it needs to (at least by default) turn
> off the add_new_files codepath, and be allowed to work without any
> pathspec (in which case all tracked paths should be renormalized).
> 

OK, then I will pick up your patch in a couple of days/weeks and push it
further
(documentation, test cases, anything else?)




Re: Line ending normalization doesn't work as expected

2017-10-05 Thread Junio C Hamano
Torsten Bögershausen  writes:

> Before we set this in stone:
> Does it make sense to say "renormalize" instead of "rehash"?
> (That term already exists for merge,
>  and "rehash" is more of a technical term than a user-point-of-view
>  explanation.)

I do not mind "renormalize" at all.

As to the toy patch, I think it needs to (at least by default) turn
off the add_new_files codepath, and be allowed to work without any
pathspec (in which case all tracked paths should be renormalized).

And we really shouldn't do the "rm && add", which would not work
well on platforms where filesystem without executing-bit support is
prevalent.  This new feature is primarily needed on platforms where
CRLF line endings are used, and unfortunately these two sets of
platforms overlap quite a bit X-<.




Re: Line ending normalization doesn't work as expected

2017-10-05 Thread Torsten Bögershausen
 
>  builtin/add.c | 42 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 41 insertions(+), 1 deletion(-)
> 
> diff --git a/builtin/add.c b/builtin/add.c
> index 5d5773d5cd..264f84dbe7 100644
> --- a/builtin/add.c
> +++ b/builtin/add.c
> @@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
>  };
>  static int patch_interactive, add_interactive, edit_interactive;
>  static int take_worktree_changes;
> +static int rehash;
>  
>  struct update_callback_data {
>  	int flags;
> @@ -121,6 +122,41 @@ int add_files_to_cache(const char *prefix,
>  	return !!data.add_errors;
>  }
>  
> +static int rehash_tracked_files(const char *prefix, const struct pathspec *pathspec,
> +				int flags)
> +{
> +	struct string_list paths = STRING_LIST_INIT_DUP;
> +	struct string_list_item *path;
> +	int i, retval = 0;
> +
> +	for (i = 0; i < active_nr; i++) {
> +		struct cache_entry *ce = active_cache[i];
> +
> +		if (ce_stage(ce))
> +			continue; /* do not touch unmerged paths */
> +		if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
> +			continue; /* do not touch non blobs */
> +		if (pathspec && !ce_path_match(ce, pathspec, NULL))
> +			continue;
> +		string_list_append(&paths, ce->name);
> +	}
> +
> +	for_each_string_list_item(path, &paths) {
> +		/*
> +		 * Having a blob contaminated with CR will trigger the
> +		 * safe-crlf kludge, avoidance of which is the primary
> +		 * thing this helper function exists.  Remove it and
> +		 * then re-add it.  Note that this may lose executable
> +		 * bit on a filesystem that lacks it.
> +		 */
> +		remove_file_from_cache(path->string);
> +		add_file_to_cache(path->string, flags);
> +	}
> +
> +	string_list_clear(&paths, 0);
> +	return retval;
> +}
> +
>  static char *prune_directory(struct dir_struct *dir, struct pathspec *pathspec, int prefix)
>  {
>  	char *seen;
> @@ -274,6 +310,7 @@ static struct option builtin_add_options[] = {
>  	OPT_BOOL('e', "edit", &edit_interactive, N_("edit current diff and apply")),
>  	OPT__FORCE(&ignored_too, N_("allow adding otherwise ignored files")),
>  	OPT_BOOL('u', "update", &take_worktree_changes, N_("update tracked files")),
> +	OPT_BOOL(0, "rehash", &rehash, N_("really update tracked files")),
>  	OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that the path will be added later")),
>  	OPT_BOOL('A', "all", &addremove_explicit, N_("add changes from all tracked and untracked files")),
>  	{ OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
> @@ -498,7 +535,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
>  
>  	plug_bulk_checkin();
>  
> -	exit_status |= add_files_to_cache(prefix, &pathspec, flags);
> +	if (rehash)
> +		exit_status |= rehash_tracked_files(prefix, &pathspec, flags);
> +	else
> +		exit_status |= add_files_to_cache(prefix, &pathspec, flags);
>  
>  	if (add_new_files)
>  		exit_status |= add_files(&dir, flags);

That looks like a nice one.
Before we set this in stone:
Does it make sense to say "renormalize" instead of "rehash"?
(That term already exists for merge,
 and "rehash" is more of a technical term than a user-point-of-view
 explanation.)
 


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Junio C Hamano
Junio C Hamano  writes:

> Both this and its "git read-tree --empty" cousin share a grave
> issue.  The "git add ." step would mean that before doing these
> commands, your working tree must be truly clean, i.e. the paths
> in the filesystem known to the index must match what is in the
> index (modulo the line-ending gotcha you are trying to correct), 
> *AND* there must be *NO* untracked paths you do not want to add
> in the working tree.
>
> That is a reason why we should solve it differently.  Perhaps adding
> a new option "git add --rehash" to tell Git "Hey, you may think some
> paths in the index and in the working tree are identical and no need
> to re-register, but you are WRONG.  For each path in the index,
> remove it and then register the object by hashing the contents from
> the filesystem afresh!" would be the best way to go.

Here is just to illustrate the direction I was heading to in the
above.  This is not even compile tested and I won't guarantee what
corner cases there are, though.

In a true production code, we shouldn't be using string-list with
two loops, but I just didn't want to spend more braincycles worrying
about removing from the list and then adding to it, both inside a
single loop that iterates over it in a mere illustration patch.

The second loop uses a simple "remove then add", but I think it
should rather be a "mark ce that it will _never_ match anything on
the working tree" followed by "add_file_to_cache()".  Currently we
do not have the "mark ce that it never matches" operation that lets
us bypass the comparison with the current cache entry (with safecrlf
thing that interferes), but we can afford to use a (in-core only)
bit in the ce flags word to represent this and plumb it through.
That way, we will still preserve the executable bit from the
original entry, hopefully ;-)


 builtin/add.c | 42 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 41 insertions(+), 1 deletion(-)

diff --git a/builtin/add.c b/builtin/add.c
index 5d5773d5cd..264f84dbe7 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -26,6 +26,7 @@ static const char * const builtin_add_usage[] = {
 };
 static int patch_interactive, add_interactive, edit_interactive;
 static int take_worktree_changes;
+static int rehash;
 
 struct update_callback_data {
 	int flags;
@@ -121,6 +122,41 @@ int add_files_to_cache(const char *prefix,
 	return !!data.add_errors;
 }
 
+static int rehash_tracked_files(const char *prefix, const struct pathspec *pathspec,
+				int flags)
+{
+	struct string_list paths = STRING_LIST_INIT_DUP;
+	struct string_list_item *path;
+	int i, retval = 0;
+
+	for (i = 0; i < active_nr; i++) {
+		struct cache_entry *ce = active_cache[i];
+
+		if (ce_stage(ce))
+			continue; /* do not touch unmerged paths */
+		if (!S_ISREG(ce->ce_mode) && !S_ISLNK(ce->ce_mode))
+			continue; /* do not touch non blobs */
+		if (pathspec && !ce_path_match(ce, pathspec, NULL))
+			continue;
+		string_list_append(&paths, ce->name);
+	}
+
+	for_each_string_list_item(path, &paths) {
+		/*
+		 * Having a blob contaminated with CR will trigger the
+		 * safe-crlf kludge, avoidance of which is the primary
+		 * thing this helper function exists.  Remove it and
+		 * then re-add it.  Note that this may lose executable
+		 * bit on a filesystem that lacks it.
+		 */
+		remove_file_from_cache(path->string);
+		add_file_to_cache(path->string, flags);
+	}
+
+	string_list_clear(&paths, 0);
+	return retval;
+}
+
 static char *prune_directory(struct dir_struct *dir, struct pathspec *pathspec, int prefix)
 {
 	char *seen;
@@ -274,6 +310,7 @@ static struct option builtin_add_options[] = {
 	OPT_BOOL('e', "edit", &edit_interactive, N_("edit current diff and apply")),
 	OPT__FORCE(&ignored_too, N_("allow adding otherwise ignored files")),
 	OPT_BOOL('u', "update", &take_worktree_changes, N_("update tracked files")),
+	OPT_BOOL(0, "rehash", &rehash, N_("really update tracked files")),
 	OPT_BOOL('N', "intent-to-add", &intent_to_add, N_("record only the fact that the path will be added later")),
 	OPT_BOOL('A', "all", &addremove_explicit, N_("add changes from all tracked and untracked files")),
 	{ OPTION_CALLBACK, 0, "ignore-removal", &addremove_explicit,
@@ -498,7 +535,10 @@ int cmd_add(int argc, const char **argv, const char *prefix)
 
 	plug_bulk_checkin();
 
-	exit_status |= add_files_to_cache(prefix, &pathspec, flags);
+	if (rehash)
+		exit_status |= rehash_tracked_files(prefix, &pathspec, flags);
+	else
+		exit_status |= add_files_to_cache(prefix, &pathspec, flags);
 
 	if (add_new_files)
 		exit_status |= add_files(&dir, flags);


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Jonathan Nieder
Junio C Hamano wrote:
> Jonathan Nieder  writes:

>>  git checkout --renormalize .
>>  git status; # Show files that will be normalized
>>  git commit; # Commit the result
>>
>> What do you think?  Would you be interested in writing a patch for it?
>> ("No" is as always an acceptable answer.)
>
> I actually think what is being requested is the opposite, i.e. "the
> objects registered in the index have wrong line endings, and the
> safe-crlf is getting in the way to prevent me from correcting by
> hashing the working tree contents again to register contents with
> corrected line endings, even with 'git add .'".
>
> So I would understand if your suggestion were for
>
>   git checkin --renormalize .
>
> but not "git checkout".  And it probably is more familiar to lay
> people if we spelled that as "git add --renormalize ." ;-)

Good catch.  You understood correctly --- "git add --renormalize" is
the feature that I think is being hinted at here.

Thanks,
Jonathan


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Junio C Hamano
Torsten Bögershausen  writes:

> One solution, which you can tell your team, is this one:
> $ git rm -r --cached . && git add .

Both this and its "git read-tree --empty" cousin share a grave
issue.  The "git add ." step would mean that before doing these
commands, your working tree must be truly clean, i.e. the paths
in the filesystem known to the index must match what is in the
index (modulo the line-ending gotcha you are trying to correct), 
*AND* there must be *NO* untracked paths you do not want to add
in the working tree.

That is a reason why we should solve it differently.  Perhaps adding
a new option "git add --rehash" to tell Git "Hey, you may think some
paths in the index and in the working tree are identical and no need
to re-register, but you are WRONG.  For each path in the index,
remove it and then register the object by hashing the contents from
the filesystem afresh!" would be the best way to go.  That will not
pick up untracked paths left in the filesystem, and does not limit
our solution to the "eol normalization is screwey" issue by not
calling the option "renormalize" or any other words that imply "why"
we are hashing again anew.
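
The untracked-path problem described above can be reproduced with a short experiment. This is a sketch only: it uses `git add --renormalize` (Git 2.16 or newer) as a stand-in for the option proposed in this message, and the file names tracked.txt and notes.tmp are made up.

```shell
#!/bin/sh
# Sketch: "empty the index, then git add ." also stages untracked files,
# whereas "git add --renormalize ." only rehashes tracked paths.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.autocrlf false

printf 'tracked\r\n' > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked file with CRLF line endings"

echo "* text=auto" > .gitattributes
echo "scratch notes" > notes.tmp      # untracked file that should stay out

# Old recipe: everything in the working tree ends up staged.
git rm -r -q --cached .
git add .
staged_old=$(git diff --cached --name-only | sort | tr '\n' ' ')

# Start over from HEAD and use the dedicated option instead.
git reset -q
git add --renormalize .
staged_new=$(git diff --cached --name-only | sort | tr '\n' ' ')

echo "rm && add:     $staged_old"
echo "--renormalize: $staged_new"
```

With the old recipe the scratch file would have to be ignored or removed first; the dedicated option leaves it alone.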



Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Junio C Hamano
Jonathan Nieder  writes:

> I suspect what we are dancing around is the need for some command like
>
>   git checkout --renormalize .
>
> which would shorten the sequence to
>
>   git checkout --renormalize .
>   git status; # Show files that will be normalized
>   git commit; # Commit the result
>
> What do you think?  Would you be interested in writing a patch for it?
> ("No" is as always an acceptable answer.)

I actually think what is being requested is the opposite, i.e. "the
objects registered in the index have wrong line endings, and the
safe-crlf is getting in the way to prevent me from correcting by
hashing the working tree contents again to register contents with
corrected line endings, even with 'git add .'".

So I would understand if your suggestion were for

git checkin --renormalize .

but not "git checkout".  And it probably is more familiar to lay
people if we spelled that as "git add --renormalize ." ;-)




Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Torsten Bögershausen
On Wed, Oct 04, 2017 at 11:26:55AM -0500, Robert Dailey wrote:
> On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
> > Torsten Bögershausen  writes:
> >
> >>> $ git rm -r --cached . && git add .
> >>
> >> (Both should work)
> >>
> >> To be honest, from the documentation, I can't figure out the difference 
> >> between
> >> $ git read-tree --empty
> >> and
> >> $ git rm -r --cached .
> >>
> >> Does anybody remember the discussion, why we ended up with read-tree ?
> >
> > We used to use neither, and considered it fine to "rm .git/index" if
> > you wanted to empty the on-disk index file in the old world.  In the
> > modern world, folks want you to avoid touching filesystem directly
> > and instead want you to use Git tools, and the above are two obvious
> > ways to do so.
> >
> > "git read-tree" (without any parameter, i.e. "read these 0 trees and
> > populate the index with it") and its modern and preferred synonym
> > "git read-tree --empty" (i.e. "I am giving 0 trees and I know the
> > sole effect of this command is to empty the index.") are more direct
> > ways to express "I want the index emptied" between the two.
> >
> > The other one, "git rm -r --cached .", in the end gives you the same
> > state because it tells Git to "iterate over all the entries in the
> > index, find the ones that match pathspec '.', and remove them from
> > the index.".  It is not wrong per-se, but conceptually it is a bit
> > roundabout way to say that "I want the index emptied", I would
> > think.
> >
> > I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
> > due to the overhead of having to do the pathspec filtering that ends
> > up to be a no-op, but there shouldn't be a difference in the end
> > result.
> 
> You guys are obviously worlds ahead of me on the internals of things,
> but from my perspective I like to avoid the "plumbing" commands as
> much as I can. Even if I used them, if I have to tell the rest of my
> team "this is the way to do it", they're going to give me dirty looks
> if I ask them to run things like this that make no sense to them.
> That's the argument I have to deal with when it comes to Git's
> usability within the team I manage. So based on this, I'd favor the
> `git rm -r --cached` approach because this is the more common result
> you see in Google, and also makes a little more sense from a high
> level of thought perspective. However, this is just my personal
> opinion. `read-tree --empty` is far less self explanatory IMHO.
> 
> Also let's not forget the second part of the command chain that
> results in the different behavior. In one case, I use `git add` which
> results in proper line ending normalization. In the other case, I do
> `git reset --hard` which does *NOT* result in the line endings
> normalized (`git status` shows no results). In both cases, I'm still
> doing `git rm -r --cached`, so I am doubtful that is the root cause
> for the line ending normalization piece. I'm still trying to
> understand why both give different results (root cause) and also get
> an understanding of what the correct (modern) solution is for line
> ending normalization (not necessarily which is the right way to
> clear/delete the index, which is really AFAIK just a means to this end
> and an implementation detail of sorts for this specific task).

Hopefully I am able to give a useful answer.

"git reset --hard" works like a hammer
and may destroy work that has been done,
in our case the cleaning of the index,
which is needed for normalization since Git 2.10 (or so)

Back to the question:
One solution, which you can tell your team, is this one:
$ git rm -r --cached . && git add .

And as Junio pointed out, this may be slower than needed.
And we don't want "slow" solutions in the official documentation ;-)

Whatever you find on search engines may get stale after a while,
so we appreciate direct questions here.

(And I will open an issue on GitHub in the next few days.)

The background is that the CRLF handling in Git changed over the years,
and one effect is that "git reset" is not "allowed" any more.

For the interested reader:
https://github.com/git-for-windows/git/issues/954
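
The difference between the two command chains can be reproduced in a scratch repository. The following is a sketch under the assumption of a reasonably recent Git (2.10 or newer, where a path whose index entry already contains CRLF is not converted); the file name Program.cs is hypothetical.

```shell
#!/bin/sh
# Sketch: after emptying the index, "git reset --hard" refills it from HEAD
# (CRLF blobs included), while "git add ." rehashes the working tree.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.autocrlf false

printf 'alpha\r\nbeta\r\n' > Program.cs
git add Program.cs
git commit -q -m "Add file with CRLF line endings"
echo "* text=auto" > .gitattributes
git add .gitattributes
git commit -q -m "Add .gitattributes"

# Attempt 1: the cleaned index is overwritten with the old entries again.
git rm -r -q --cached .
git reset -q --hard
changes_after_reset=$(git status --porcelain | wc -l | tr -d ' ')

# Attempt 2: the index stays empty until the files are hashed afresh.
git rm -r -q --cached .
git add .
changes_after_add=$(git status --porcelain | wc -l | tr -d ' ')

echo "after reset --hard: $changes_after_reset change(s)"
echo "after add .:        $changes_after_add change(s)"
```

In the first attempt the index entry with CRLF is back before Git compares anything, so nothing looks modified; in the second the file is converted while being added.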



Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Robert Dailey
On Wed, Oct 4, 2017 at 11:59 AM, Jonathan Nieder  wrote:
> Hi Robert,
>
> Robert Dailey wrote:
>
>> You guys are obviously worlds ahead of me on the internals of things,
>> but from my perspective I like to avoid the "plumbing" commands as
>> much as I can.
>
> I suspect what we are dancing around is the need for some command like
>
> git checkout --renormalize .
>
> which would shorten the sequence to
>
> git checkout --renormalize .
> git status; # Show files that will be normalized
> git commit; # Commit the result
>
> What do you think?  Would you be interested in writing a patch for it?
> ("No" is as always an acceptable answer.)

I wish I could, but ultimately I'd probably not be able to do it. I
rarely have time to do recreational coding outside of work these days.

That aside, for now I want to know the proper & recommended method to
renormalize line endings using existing commands. Additionally I also
am interested in knowing why only 1 of the 3 solutions I tried (In my
OP) worked but the others didn't. My short term goal is just to get
educated a bit. There's so much conflicting and variable information
on this topic on Google. It makes it difficult to find the one true
path, especially since Git evolves and improves over time and
information usually becomes stale.


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Jonathan Nieder
Hi Robert,

Robert Dailey wrote:

> You guys are obviously worlds ahead of me on the internals of things,
> but from my perspective I like to avoid the "plumbing" commands as
> much as I can.

I suspect what we are dancing around is the need for some command like

git checkout --renormalize .

which would shorten the sequence to

git checkout --renormalize .
git status; # Show files that will be normalized
git commit; # Commit the result

What do you think?  Would you be interested in writing a patch for it?
("No" is as always an acceptable answer.)

Thanks,
Jonathan


Re: Line ending normalization doesn't work as expected

2017-10-04 Thread Robert Dailey
On Tue, Oct 3, 2017 at 9:00 PM, Junio C Hamano  wrote:
> Torsten Bögershausen  writes:
>
>>> $ git rm -r --cached . && git add .
>>
>> (Both should work)
>>
>> To be honest, from the documentation, I can't figure out the difference 
>> between
>> $ git read-tree --empty
>> and
>> $ git rm -r --cached .
>>
>> Does anybody remember the discussion, why we ended up with read-tree ?
>
> We used to use neither, and considered it fine to "rm .git/index" if
> you wanted to empty the on-disk index file in the old world.  In the
> modern world, folks want you to avoid touching filesystem directly
> and instead want you to use Git tools, and the above are two obvious
> ways to do so.
>
> "git read-tree" (without any parameter, i.e. "read these 0 trees and
> populate the index with it") and its modern and preferred synonym
> "git read-tree --empty" (i.e. "I am giving 0 trees and I know the
> sole effect of this command is to empty the index.") are more direct
> ways to express "I want the index emptied" between the two.
>
> The other one, "git rm -r --cached .", in the end gives you the same
> state because it tells Git to "iterate over all the entries in the
> index, find the ones that match pathspec '.', and remove them from
> the index.".  It is not wrong per-se, but conceptually it is a bit
> roundabout way to say that "I want the index emptied", I would
> think.
>
> I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
> due to the overhead of having to do the pathspec filtering that ends
> up to be a no-op, but there shouldn't be a difference in the end
> result.

You guys are obviously worlds ahead of me on the internals of things,
but from my perspective I like to avoid the "plumbing" commands as
much as I can. Even if I used them, if I have to tell the rest of my
team "this is the way to do it", they're going to give me dirty looks
if I ask them to run things like this that make no sense to them.
That's the argument I have to deal with when it comes to Git's
usability within the team I manage. So based on this, I'd favor the
`git rm -r --cached` approach because this is the more common result
you see in Google, and also makes a little more sense from a high
level of thought perspective. However, this is just my personal
opinion. `read-tree --empty` is far less self explanatory IMHO.

Also let's not forget the second part of the command chain that
results in the different behavior. In one case, I use `git add` which
results in proper line ending normalization. In the other case, I do
`git reset --hard` which does *NOT* result in the line endings
normalized (`git status` shows no results). In both cases, I'm still
doing `git rm -r --cached`, so I am doubtful that is the root cause
for the line ending normalization piece. I'm still trying to
understand why both give different results (root cause) and also get
an understanding of what the correct (modern) solution is for line
ending normalization (not necessarily which is the right way to
clear/delete the index, which is really AFAIK just a means to this end
and an implementation detail of sorts for this specific task).


Re: Line ending normalization doesn't work as expected

2017-10-03 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> $ git rm -r --cached . && git add .
>
> (Both should work)
>
> To be honest, from the documentation, I can't figure out the difference 
> between
> $ git read-tree --empty
> and
> $ git rm -r --cached .
>
> Does anybody remember the discussion, why we ended up with read-tree ?

We used to use neither, and considered it fine to "rm .git/index" if
you wanted to empty the on-disk index file in the old world.  In the
modern world, folks want you to avoid touching filesystem directly
and instead want you to use Git tools, and the above are two obvious
ways to do so.

"git read-tree" (without any parameter, i.e. "read these 0 trees and
populate the index with it") and its modern and preferred synonym
"git read-tree --empty" (i.e. "I am giving 0 trees and I know the
sole effect of this command is to empty the index.") are more direct
ways to express "I want the index emptied" between the two.

The other one, "git rm -r --cached .", in the end gives you the same
state because it tells Git to "iterate over all the entries in the
index, find the ones that match pathspec '.', and remove them from
the index.".  It is not wrong per-se, but conceptually it is a bit
roundabout way to say that "I want the index emptied", I would
think.

I wouldn't be surprised if the "rm -r --cached ." were a lot slower,
due to the overhead of having to do the pathspec filtering that ends
up to be a no-op, but there shouldn't be a difference in the end
result.
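
That both commands leave the index in the same (empty) state can be checked quickly. A minimal sketch, with made-up file names, that counts index entries after each command:

```shell
#!/bin/sh
# Sketch: "git read-tree --empty" and "git rm -r --cached ." both end up
# with an index that has no entries; the working tree is left untouched.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"

echo a > a.txt
mkdir sub
echo b > sub/b.txt
git add .
git commit -q -m "Add two files"

git read-tree --empty
count_read_tree=$(git ls-files | wc -l | tr -d ' ')

git reset -q                       # refill the index from HEAD
count_restored=$(git ls-files | wc -l | tr -d ' ')

git rm -r -q --cached .
count_rm=$(git ls-files | wc -l | tr -d ' ')

echo "read-tree --empty: $count_read_tree, restored: $count_restored, rm --cached: $count_rm"
```

The sketch says nothing about speed; on a repository this small any difference would not be measurable.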


Re: Line ending normalization doesn't work as expected

2017-10-03 Thread Torsten Bögershausen
On 2017-10-03 19:23, Robert Dailey wrote:
> On Tue, Oct 3, 2017 at 11:26 AM, Torsten Bögershausen  wrote:
>> The short version is that the instructions on GitHub are outdated.
>> This is the official procedure (since 2016, Git v2.12 or so),
>> but it should work even with older versions of Git.
>>
>> $ echo "* text=auto" >.gitattributes
>> $ git read-tree --empty   # Clean index, force re-scan of working directory
>> $ git add .
>> $ git status              # Show files that will be normalized
>> $ git commit -m "Introduce end-of-line normalization"
> 
> Is the way I did it that worked also a valid solution? Or did it only
> work accidentally? Again the command I ran that worked is:
> 
> $ git rm -r --cached . && git add .

(Both should work)

To be honest, from the documentation, I can't figure out the difference between
$ git read-tree --empty
and
$ git rm -r --cached .

Does anybody remember the discussion, why we ended up with read-tree ?


Re: Line ending normalization doesn't work as expected

2017-10-03 Thread Robert Dailey
On Tue, Oct 3, 2017 at 11:26 AM, Torsten Bögershausen  wrote:
> The short version is that the instructions on GitHub are outdated.
> This is the official procedure (since 2016, Git v2.12 or so),
> but it should work even with older versions of Git.
>
> $ echo "* text=auto" >.gitattributes
> $ git read-tree --empty   # Clean index, force re-scan of working directory
> $ git add .
> $ git status              # Show files that will be normalized
> $ git commit -m "Introduce end-of-line normalization"

Is the way I did it that worked also a valid solution? Or did it only
work accidentally? Again the command I ran that worked is:

$ git rm -r --cached . && git add .


Re: Line ending normalization doesn't work as expected

2017-10-03 Thread Torsten Bögershausen
On 2017-10-03 17:00, Robert Dailey wrote:
> I'm on Windows using Git for Windows v2.13.1. Following github's
> recommended process for fixing line endings after adding a
> `.gitattributes` file[1], I run the following:
> 
> $ rm .git/index && git reset
> 
> Once I run `git status`, I see that no files have changed. Note that I
> know for a fact in my repository, files were committed using CRLF line
> endings (the files in question are C# code files, and no
> .gitattributes was present at the time).
> 
> I also tried this:
> 
> $ git rm -r --cached . && git reset --hard
> 
> However, again `git status` shows no working copy modifications. The
> one thing that *did* work (and I tried this on accident actually) is:
> 
> $ git rm -r --cached . && git add .
> 
> This properly showed all files in my index with line ending
> modifications (I ran `git diff --cached -R` to be sure; the output
> shows `^M` at the end of each line in the diff in this case). Note
> that my global git config has `core.autocrlf` set to `false`, but I
> also tried the top 2 commands above with it set to `true` but it made
> no difference.
> 
> So my question is: Why do the top 2 commands not work, but the third
> one does? To me this all feels like magic / nondeterministic, so I'm
> hoping someone here knows what is going on and can explain the logic
> of it. Also if this is a git config issue, I'm not sure what it could
> be. Note my `.gitattributes` just has this in it:

The short version is that the instructions on GitHub are outdated.
This is the official procedure (since 2016, Git v2.12 or so),
but it should work even with older versions of Git.

$ echo "* text=auto" >.gitattributes
$ git read-tree --empty   # Clean index, force re-scan of working directory
$ git add .
$ git status              # Show files that will be normalized
$ git commit -m "Introduce end-of-line normalization"
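
To see what the procedure above changes, the line endings recorded in the index can be inspected before and after it. This is a sketch, assuming a Git that has `git ls-files --eol` (2.8 or newer); the file names crlf.txt and lf.txt are invented.

```shell
#!/bin/sh
# Sketch: inspect index line endings with "git ls-files --eol" before and
# after the "read-tree --empty && add ." procedure.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.invalid"
git config core.autocrlf false

printf 'one\r\ntwo\r\n' > crlf.txt
printf 'one\ntwo\n' > lf.txt
git add .
git commit -q -m "Add files with mixed line-ending conventions"
eol_before=$(git ls-files --eol crlf.txt)

echo "* text=auto" > .gitattributes
git read-tree --empty
git add .
eol_after=$(git ls-files --eol crlf.txt)

echo "before: $eol_before"      # index column should read i/crlf
echo "after:  $eol_after"       # index column should read i/lf
git commit -q -m "Introduce end-of-line normalization"
```

The first column of each line describes the index, the second the working tree, which makes it easy to spot paths that still carry CRLF in the repository.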


Could you open an issue on Github ?
(Or can someone @github fix this ?)

> 
> * text=auto
> 
> Thanks in advance.
> 
> 
> [1]: https://help.github.com/articles/dealing-with-line-endings/
>