Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-24 Thread Yuki Kokubun
>> "git filter-branch -- --all" print unwanted error messages when refs that
>> cannot be used with ^0 exist.
>
> It is not incorrect per-se, but if I were writing this, I'd say
> "... when refs that point at objects that are not committish" or
> something like that, as that is much closer to people (both end
> users and Git developers) than the "^0" implementation detail.

I agree that readability to users is important than the implementation detail.
So, I'm gonna change the commit message.

>> Such refs can be created by "git replace" with
>> trees or blobs. Also, "git tag" with trees or blobs can create such refs.
> 
> Taking two paragraphs above together, you wrote an excellent
> description of the problem to be solved (i.e. what would be seen by
> users and under what condition the symptom would trigger).  If there
> is a single obvious solution that would trivially follow from the
> problem description, it is perfectly fine to stop here.  Otherwise,
> it would help to describe how it is solved next.  Something as brief
> like
> 
>   Filter these problematic refs out early, and warn that these
>   refs are left unwritten while doing so.
> 
> would suffice in this case, I think.  We _could_ insert
> 
>   before they are seen by the logic to see which refs have
>   been modified and which have been left intact (which is
>   where the unwanted error messages come from),
> 
> between "early," and "and warn", if we wanted to.

I think the detailed description is better than the shorter one in this case.
So I'm gonna follow to detailed one.


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-22 Thread Yuki Kokubun
> However, if we pre-filter to limit the refs in "$tempdir/heads" to
> those that are committish (i.e. those that pass "$ref^0") like the
> patch and subsequent discussion suggests, wouldn't we lose the
> warning for these replace refs and non-committish tags.  We perhaps
> could do something like:
> 
>   git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit
> 
>   while read ref
>   do
>   case "$ref" in ^?*) continue ;; esac
>   if git rev-parse --verify "$ref^0" 2>/dev/null
> then
>   echo "$ref"
>   else
>   warn "WARNING: not rewriting '$ref' (not a committish)"
>   fi
>   done >"$tempdir/heads" <"$tempdir/raw-heads"
> 
> (note: the else clause is new, relative to my earlier suggestion).

I agree these suggestions.
I'm gonna send a new patch that follow it.


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-22 Thread Junio C Hamano
Yuki Kokubun  writes:

>> Yuki Kokubun  writes:
>> 
>> >> Yuki Kokubun  writes:
>> >> 
>> >> > "git filter-branch -- --all" can be confused when refs that refer to 
>> >> > objects
>> >> > other than commits or tags exists.
>> ...
>
> I meant the confusion is abnormal messages from the output of "git 
> filter-branch -- --all".

OK, so it is not that the program logic gets confused and ends up
performing a wrong rewrite, but mostly that it gives confusing
messages.

> For example, this is an output of "git filter-branch -- --all":
>
> Rewrite bcdbd016c77df3d5641a3cf820b2ed46ba7bf3b4 (5/5) (0 seconds passed, 
> remaining 0 predicted)
> WARNING: Ref 'refs/heads/master' is unchanged
> WARNING: Ref 'refs/heads/no-newline' is unchanged
> WARNING: Ref 'refs/heads/original' is unchanged

These are worth keeping, as I think existing users expect to see
them.

> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> fatal: ambiguous argument 
> 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9^0': unknown revision 
> or path not in the working tree.
> Use '--' to separate paths from revisions, like this:
> 'git  [...] -- [...]'
> WARNING: Ref 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9' is 
> unchanged
> WARNING: Ref 'refs/tags/add-file' is unchanged
> WARNING: Ref 'refs/tags/file' is unchanged
> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
> fatal: ambiguous argument 'refs/tags/treetag^0': unknown revision or path not 
> in the working tree.
> Use '--' to separate paths from revisions, like this:
> 'git  [...] -- [...]'
> WARNING: Ref 'refs/tags/treetag' is unchanged

I think these warning messages should be kept, especially if we are
to keep the warning messages for the unchanged branches.  However,
the internal error messages are unwanted--these are implementation
details that reach the conclusion, i.e. the ref we were asked to
rewrite ended up being unchanged hence we did not touch it.

However, if we pre-filter to limit the refs in "$tempdir/heads" to
those that are committish (i.e. those that pass "$ref^0") like the
patch and subsequent discussion suggests, wouldn't we lose the
warning for these replace refs and non-committish tags.  We perhaps
could do something like:

git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit

while read ref
do
case "$ref" in ^?*) continue ;; esac
if git rev-parse --verify "$ref^0" 2>/dev/null
then
echo "$ref"
else
warn "WARNING: not rewriting '$ref' (not a committish)"
fi
done >"$tempdir/heads" <"$tempdir/raw-heads"

(note: the else clause is new, relative to my earlier suggestion).


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-22 Thread Yuki Kokubun
> Yuki Kokubun  writes:
> 
> >> Yuki Kokubun  writes:
> >> 
> >> > "git filter-branch -- --all" can be confused when refs that refer to 
> >> > objects
> >> > other than commits or tags exists.
> >> > Because "git rev-parse --all" that is internally used can return refs 
> >> > that
> >> > refer to an object other than commit or tag. But it is not considered in 
> >> > the
> >> > phase of updating refs.
> >> 
> >> Could you describe what the consequence of that is?  We have a ref
> >> that points directly at a blob object, or a ref that points at a tag
> >> object that points at a blob object.  The current code leaves both of
> >> these refs in "$tempdir/heads".  Then...?
> >
> > Sorry, this is my wrong.
> > I wrongly thought only refs/replace can point at a blob or tree object.
> 
> No need to be sorry.  You still need to describe what (bad things)
> happen if we do not filter out refs that do not point at committish
> in the proposed log message.  
> 
> IOW, can you elaborate and clarify your "can be confused" at the
> beginning?

I meant the confusion is abnormal messages from the output of "git 
filter-branch -- --all".
For example, this is an output of "git filter-branch -- --all":

Rewrite bcdbd016c77df3d5641a3cf820b2ed46ba7bf3b4 (5/5) (0 seconds passed, 
remaining 0 predicted)
WARNING: Ref 'refs/heads/master' is unchanged
WARNING: Ref 'refs/heads/no-newline' is unchanged
WARNING: Ref 'refs/heads/original' is unchanged
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
fatal: ambiguous argument 
'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9^0': unknown revision or 
path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git  [...] -- [...]'
WARNING: Ref 'refs/replace/8a2016f3730cad8309c110f819c855403ed0a5b9' is 
unchanged
WARNING: Ref 'refs/tags/add-file' is unchanged
WARNING: Ref 'refs/tags/file' is unchanged
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
error: object 1bf53b49c26465454e4ac377f2ed3f91bb1d6ac1 is a tree, not a commit
fatal: ambiguous argument 'refs/tags/treetag^0': unknown revision or path not 
in the working tree.
Use '--' to separate paths from revisions, like this:
'git  [...] -- [...]'
WARNING: Ref 'refs/tags/treetag' is unchanged

You can see a lot of terrible messages such as "error" and "fatal".
But on the whole, the result of "git filter-branch -- --all" is not so abnormal.
So, this is a just problem about abonormal messages.

I think this messages should be suppressed.
How do you feel about it?


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Junio C Hamano
Yuki Kokubun  writes:

>> Yuki Kokubun  writes:
>> 
>> > "git filter-branch -- --all" can be confused when refs that refer to 
>> > objects
>> > other than commits or tags exists.
>> > Because "git rev-parse --all" that is internally used can return refs that
>> > refer to an object other than commit or tag. But it is not considered in 
>> > the
>> > phase of updating refs.
>> 
>> Could you describe what the consequence of that is?  We have a ref
>> that points directly at a blob object, or a ref that points at a tag
>> object that points at a blob object.  The current code leaves both of
>> these refs in "$tempdir/heads".  Then...?
>
> Sorry, this is my wrong.
> I wrongly thought only refs/replace can point at a blob or tree object.

No need to be sorry.  You still need to describe what (bad things)
happen if we do not filter out refs that do not point at committish
in the proposed log message.  

IOW, can you elaborate and clarify your "can be confused" at the
beginning?


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Yuki Kokubun
> Yuki Kokubun  writes:
> 
> > "git filter-branch -- --all" can be confused when refs that refer to objects
> > other than commits or tags exists.
> > Because "git rev-parse --all" that is internally used can return refs that
> > refer to an object other than commit or tag. But it is not considered in the
> > phase of updating refs.
> 
> Could you describe what the consequence of that is?  We have a ref
> that points directly at a blob object, or a ref that points at a tag
> object that points at a blob object.  The current code leaves both of
> these refs in "$tempdir/heads".  Then...?

Sorry, this is my wrong.
I wrongly thought only refs/replace can point at a blob or tree object.

> 
>   ... goes and looks ...
> 
> There is a loop that looks like this:
> 
>   while read ref
>   do
>   sha1=$(git rev-parse "$ref^0")
>   ...
>   done <"$tempdir/heads"
> 
> which would break on anything but a commit-ish.
> 
> >  # The refs should be updated if their heads were rewritten
> >  git rev-parse --no-flags --revs-only --symbolic-full-name \
> > -   --default HEAD "$@" > "$tempdir"/raw-heads || exit
> > +   --default HEAD "$@" > "$tempdir"/raw-objects || exit
> > +# refs/replace can refer to an object other than commit or tag
> 
> Mention of replace refs in the proposed log message gives an easy to
> understand example and is a good idea, but this in code comment does
> not have to single out the replace refs.  A tag can also point at an
> object with any type, e.g. "git tag v2.6.11-tree v2.6.11^{tree}"
> would make "refs/tags/v2.6.11-tree" point at the tree at the top
> level of the tree-ish "v2.6.11".  It probably is OK to drop this
> comment altogether.

OK, I'm gonna drop the incorrect comment.

> 
> > +while read ref
> > +do
> > +   type=$(git cat-file -t "$ref")
> > +   if test $type = commit || test $type = tag
> > +   then
> > +   echo "$ref"
> > +   fi
> > +done >"$tempdir"/raw-heads <"$tempdir"/raw-objects
> >  sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads
> 
> So... is the idea to limit the set of refs to be rewritten to those
> that point at commits and tags?  As I already alluded to, I do not
> think you want to accept a ref that points at any tag object---only
> the ones that point at a tag that points at a commit-ish, so that
> the code will not barf when doing "$ref^0".
> 
> So perhaps
> 
>   git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit
> 
>   while read ref
>   do
>   case "$ref" in ^?*) continue ;; esac
>   if git rev-parse --verify "$ref^0" 2>/dev/null
> then
>   echo "$ref"
>   fi
>   done >"$tempdir/heads" <"$tempdir/raw-heads"
> 
> or something?  Note that you do not need the "sed" as the loop
> already excludes the negative revs.

I feel using "git rev-parse --verify" is a good way as you said.
I'm gonna modify the patch to use it.

> 
> >  test -s "$tempdir"/heads ||
> > diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
> > index 7cb60799b..efeaf5887 100755
> > --- a/t/t7003-filter-branch.sh
> > +++ b/t/t7003-filter-branch.sh
> > @@ -470,4 +470,17 @@ test_expect_success 'tree-filter deals with object 
> > name vs pathname ambiguity' '
> > git show HEAD:$ambiguous
> >  '
> >  
> > +test_expect_success 'rewrite repository including refs/replace that point 
> > to non commit object' '
> > +   test_when_finished "git reset --hard original" &&
> > +   tree=$(git rev-parse HEAD^{tree}) &&
> > +   test_when_finished "git replace -d $tree" &&
> > +   echo A >new &&
> > +   git add new &&
> > +   new_tree=$(git write-tree) &&
> > +   git replace $tree $new_tree &&
> 
> Perhaps something like this here:
> 
>   git tag -a "tag to a tree" treetag $new_tree &&
> 
> can tell su how well it works with a tag that points at a tree?

Sounds good. I'm gonna add such tags to the test case.

> 
> > +   git reset --hard HEAD &&
> > +   git filter-branch -f -- --all >filter-output 2>&1 &&
> > +   ! fgrep fatal filter-output
> > +'
> > +
> >  test_done


Re: [PATCH] filter-branch: consider refs can refer to an object other than commit or tag

2018-03-21 Thread Junio C Hamano
Yuki Kokubun  writes:

> "git filter-branch -- --all" can be confused when refs that refer to objects
> other than commits or tags exists.
> Because "git rev-parse --all" that is internally used can return refs that
> refer to an object other than commit or tag. But it is not considered in the
> phase of updating refs.

Could you describe what the consequence of that is?  We have a ref
that points directly at a blob object, or a ref that points at a tag
object that points at a blob object.  The current code leaves both of
these refs in "$tempdir/heads".  Then...?

... goes and looks ...

There is a loop that looks like this:

while read ref
do
sha1=$(git rev-parse "$ref^0")
...
done <"$tempdir/heads"

which would break on anything but a commit-ish.

>  # The refs should be updated if their heads were rewritten
>  git rev-parse --no-flags --revs-only --symbolic-full-name \
> - --default HEAD "$@" > "$tempdir"/raw-heads || exit
> + --default HEAD "$@" > "$tempdir"/raw-objects || exit
> +# refs/replace can refer to an object other than commit or tag

Mention of replace refs in the proposed log message gives an easy to
understand example and is a good idea, but this in code comment does
not have to single out the replace refs.  A tag can also point at an
object with any type, e.g. "git tag v2.6.11-tree v2.6.11^{tree}"
would make "refs/tags/v2.6.11-tree" point at the tree at the top
level of the tree-ish "v2.6.11".  It probably is OK to drop this
comment altogether.

> +while read ref
> +do
> + type=$(git cat-file -t "$ref")
> + if test $type = commit || test $type = tag
> + then
> + echo "$ref"
> + fi
> +done >"$tempdir"/raw-heads <"$tempdir"/raw-objects
>  sed -e '/^^/d' "$tempdir"/raw-heads >"$tempdir"/heads

So... is the idea to limit the set of refs to be rewritten to those
that point at commits and tags?  As I already alluded to, I do not
think you want to accept a ref that points at any tag object---only
the ones that point at a tag that points at a commit-ish, so that
the code will not barf when doing "$ref^0".

So perhaps

git rev-parse --no-flags ... >"$tempdir/raw-heads" || exit

while read ref
do
case "$ref" in ^?*) continue ;; esac
if git rev-parse --verify "$ref^0" 2>/dev/null
then
echo "$ref"
fi
done >"$tempdir/heads" <"$tempdir/raw-heads"

or something?  Note that you do not need the "sed" as the loop
already excludes the negative revs.

>  test -s "$tempdir"/heads ||
> diff --git a/t/t7003-filter-branch.sh b/t/t7003-filter-branch.sh
> index 7cb60799b..efeaf5887 100755
> --- a/t/t7003-filter-branch.sh
> +++ b/t/t7003-filter-branch.sh
> @@ -470,4 +470,17 @@ test_expect_success 'tree-filter deals with object name 
> vs pathname ambiguity' '
>   git show HEAD:$ambiguous
>  '
>  
> +test_expect_success 'rewrite repository including refs/replace that point to 
> non commit object' '
> + test_when_finished "git reset --hard original" &&
> + tree=$(git rev-parse HEAD^{tree}) &&
> + test_when_finished "git replace -d $tree" &&
> + echo A >new &&
> + git add new &&
> + new_tree=$(git write-tree) &&
> + git replace $tree $new_tree &&

Perhaps something like this here:

git tag -a "tag to a tree" treetag $new_tree &&

can tell su how well it works with a tag that points at a tree?

> + git reset --hard HEAD &&
> + git filter-branch -f -- --all >filter-output 2>&1 &&
> + ! fgrep fatal filter-output
> +'
> +
>  test_done