Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Junio C Hamano
Jeff King  writes:

>> Sorry, Torsten, this is not my doing. So I cannot explain why it is not an
>> enum.
>> 
>> I *guess* the rationale for 'c' being the cmdmode of --textconv is "c for
>> convert". That is why I chose "w as in worktree" as new cmdmode.
>
> I think it started as a 'char' because it was representing the
> single-letter options of "cat-file -e", "cat-file -s", etc. And then the
> textconv patches started the monstrosity of "c".

I think it was merely 't' was taken for "-t" (one beauty of using
the OPT_CMDMODE for options with shorthand is that we do not need
any enum, as the short-form options already form an enum), so
"textConv" was the compromise.  It could have been 'T' or even \C-t
;-)

> I don't think cleaning it up needs to be part of your series, but it
> might be nice low-hanging fruit for somebody to clean up on top.

I agree.

I briefly wondered if we want to add a check to ensure uniqueness of
these cmdmode letters in parse_options_check(), but that would
forbid us from having synonymous options (e.g. "am --continue" and
"am --resolved" both map to the same), so it would not work.

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Jeff King
On Fri, Aug 19, 2016 at 02:25:51PM +0200, Johannes Schindelin wrote:

> > On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
> > > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > > index 5ff58b3..5f91cf4 100644
> > > --- a/builtin/cat-file.c
> > > +++ b/builtin/cat-file.c
> > > @@ -17,6 +17,7 @@ struct batch_options {
> > >   int print_contents;
> > >   int buffer_output;
> > >   int all_objects;
> > > + int cmdmode; /* may be 'w' or 'c' for --filters or --textconv */
> > How do I read 'w' and 'c' ?
> > wilter and cextconv ? Does it make sense to use an enum here ?
> > Or a #define ?
> 
> Sorry, Torsten, this is not my doing. So I cannot explain why it is not an
> enum.
> 
> I *guess* the rationale for 'c' being the cmdmode of --textconv is "c for
> convert". That is why I chose "w as in worktree" as new cmdmode.

I think it started as a 'char' because it was representing the
single-letter options of "cat-file -e", "cat-file -s", etc. And then the
textconv patches started the monstrosity of "c".

I don't think cleaning it up needs to be part of your series, but it
might be nice low-hanging fruit for somebody to clean up on top.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Jeff King
On Fri, Aug 19, 2016 at 02:59:32PM +0200, Johannes Schindelin wrote:

> > The only thing that somewhat is worrysome is what would happen to
> > %(rest).
> 
> What would happen is that it would print out the path, as it is exactly
> that "rest" field that is used in the implementation, piggybacking on that
> very nice feature.

Good, that is what I would expect to happen.

> Of course, this means that you simply cannot send "SHA-1 path rest" to
> cat-file --batch and expect that to work. Should anybody need that in the
> future, they can give it a try themselves to patch cat-file.

Right. The real motivation for %(rest) was to accept "git rev-list
--objects" format, so you can do stuff like:

  git rev-list --objects --all |
  git cat-file --batch-check='%(objectsize) %(rest)' |
  sort -rn |
  head

to get the paths of the largest objects. So the fact that "%(rest)" will
be the filename in this case is almost certainly the most useful default
(and if somebody really wants something else, they can try to figure out
other semantics later).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Jeff King
On Fri, Aug 19, 2016 at 03:09:19PM +0200, Johannes Schindelin wrote:

> > The object name can have spaces in it, too. E.g.:
> > 
> >   HEAD:path with spaces
> > 
> > or even:
> > 
> >   :/grep for this
> > 
> > (as was pointed out to me when I tried to turn on %(rest) handling by
> > default, long ago). How do those work with your patch?
> 
> They don't ;-)
> 
> And quite frankly, the documentation should make it clear to users that
> batch mode with --filters or --textconv won't work when the object name
> contains white space: it says that the object name is split from the path
> at the first white space character.

I think that is an obvious implication of the new documentation. I was
just concerned with a regression from the previous behavior.

> > It looks like the extra split isn't enabled unless one of those options
> > is selected. Since --filters is new, that's OK for backwards
> > compatibility. But --textconv isn't.
> 
> Except that it is okay, because --textconv *was not even supported in
> batch mode*. So there is no backwards compatibility that could be broken.

Ah, OK. I thought we handled "HEAD:path with spaces" there, but I see
that you cannot even specify "--textconv" with "--batch", because it
complains of the cmdmode.

So the new rule becomes "we split if we see %(rest), or --textconv, or
--filter", which is reasonable.

> Fixing %(rest) for object names containing spaces is distinctly outside
> the original intent, and certainly outside of my use case.

Yeah, I agree it is not necessary for this series (I was only
considering it as an option to fix the regression I now see doesn't
exist).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Johannes Schindelin
Hi Junio,

On Thu, 18 Aug 2016, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
> >
> >> With this patch, --batch can be combined with --textconv or --filters.
> >> For this to work, the input needs to have the form
> >> 
> >>
> >> 
> >> so that the filters can be chosen appropriately.
> >
> > The object name can have spaces in it, too. E.g.:
> >
> >   HEAD:path with spaces
> >
> > or even:
> >
> >   :/grep for this
> 
> When I wrote my review, I didn't consider this use case.
> 
> There is no -z format in --batch, which is unfortunate.  If we had
> one, it would trivially make it possible to do so, and we can even
> have paths with LF in them ;-).  On the other hand, producing a NUL
> separated input is a chore.

I think it would make for a fine little project to add support for --batch
-z.

Just not in this here patch series.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Johannes Schindelin
Hi Peff,

On Thu, 18 Aug 2016, Jeff King wrote:

> On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
> 
> > With this patch, --batch can be combined with --textconv or --filters.
> > For this to work, the input needs to have the form
> > 
> > 
> > 
> > so that the filters can be chosen appropriately.
> 
> The object name can have spaces in it, too. E.g.:
> 
>   HEAD:path with spaces
> 
> or even:
> 
>   :/grep for this
> 
> (as was pointed out to me when I tried to turn on %(rest) handling by
> default, long ago). How do those work with your patch?

They don't ;-)

And quite frankly, the documentation should make it clear to users that
batch mode with --filters or --textconv won't work when the object name
contains white space: it says that the object name is split from the path
at the first white space character.

> It looks like the extra split isn't enabled unless one of those options
> is selected. Since --filters is new, that's OK for backwards
> compatibility. But --textconv isn't.

Except that it is okay, because --textconv *was not even supported in
batch mode*. So there is no backwards compatibility that could be broken.

> I wonder if we need an option specifically to say "the object name can
> be split". Right now it kicks in automatically if you use "%(rest)" in
> your format, but you might not care about passing along that output
> (e.g., a lot of times I am piping "rev-list" straight to cat-file, and I
> have to use a separate "cut" to throw away the pathnames).

As I said elsewhere in this thread: if anybody encounters that problem,
they are welcome to fix it themselves.

My patch series purpose is to introduce --filters and make it compatible
with the batch mode. As I imitated --textconv, it was both easy and
correct to make it compatible with the batch mode, too. But that is the
extent of this here patch series.

Fixing %(rest) for object names containing spaces is distinctly outside
the original intent, and certainly outside of my use case.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Johannes Schindelin
Hi Junio,

On Thu, 18 Aug 2016, Junio C Hamano wrote:

> Johannes Schindelin  writes:
> 
> > With this patch, --batch can be combined with --textconv or --filters.
> > For this to work, the input needs to have the form
> >
> > 
> >
> > so that the filters can be chosen appropriately.
> >  --batch::
> >  --batch=::
> > Print object information and contents for each object provided
> > -   on stdin.  May not be combined with any other options or arguments.
> > -   See the section `BATCH OUTPUT` below for details.
> > +   on stdin.  May not be combined with any other options or arguments
> > +   except `--textconv` or `--filters`, in which case the input lines
> > +   also need to specify the path, separated by white space.  See the
> > +   section `BATCH OUTPUT` below for details.
> 
> Makes sense.  This format still allows
> 
>   HEAD: 
> 
> i.e. the object name is taken from path2 but we forget it and
> pretend that the blob sits at path2, which is a good feature.

Correct.

> If I am not mistaken, your cover letter alluded that it might be
> ideal to take "HEAD:" (nothing else) as input, grab ""
> and feed that to the filtering machinery, but you decided to stop
> short of doing that.  I actually think that it is the right thing to
> do for this feature to ignore the trailing : in the object
> name, iow, I agree with the result from the feature design POV.

I actually decided against it only because it would uglify the code. From
a user point of view, I am a bit annoyed having to say

:README README
:Makefile Makefile
:main.c main.c
[...]

> The only thing that somewhat is worrysome is what would happen to
> %(rest).

What would happen is that it would print out the path, as it is exactly
that "rest" field that is used in the implementation, piggybacking on that
very nice feature.

Of course, this means that you simply cannot send "SHA-1 path rest" to
cat-file --batch and expect that to work. Should anybody need that in the
future, they can give it a try themselves to patch cat-file.

Ciao,
Dscho
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-19 Thread Johannes Schindelin
Hi Torsten,

On Thu, 18 Aug 2016, Torsten Bögershausen wrote:

> On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
> > diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> > index 5ff58b3..5f91cf4 100644
> > --- a/builtin/cat-file.c
> > +++ b/builtin/cat-file.c
> > @@ -17,6 +17,7 @@ struct batch_options {
> > int print_contents;
> > int buffer_output;
> > int all_objects;
> > +   int cmdmode; /* may be 'w' or 'c' for --filters or --textconv */
> How do I read 'w' and 'c' ?
> wilter and cextconv ? Does it make sense to use an enum here ?
> Or a #define ?

Sorry, Torsten, this is not my doing. So I cannot explain why it is not an
enum.

I *guess* the rationale for 'c' being the cmdmode of --textconv is "c for
convert". That is why I chose "w as in worktree" as new cmdmode.

Ciao,
Dscho

Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-18 Thread Torsten Bögershausen
On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
> With this patch, --batch can be combined with --textconv or --filters.
> For this to work, the input needs to have the form
> 
>   
> 
> so that the filters can be chosen appropriately.
> 
> Signed-off-by: Johannes Schindelin 
> ---
>  Documentation/git-cat-file.txt | 18 +++-
>  builtin/cat-file.c | 49 
> +-
>  t/t8010-cat-file-filters.sh| 10 +
>  3 files changed, 67 insertions(+), 10 deletions(-)
> 
> diff --git a/Documentation/git-cat-file.txt b/Documentation/git-cat-file.txt
> index 59a3c37..1f4d954 100644
> --- a/Documentation/git-cat-file.txt
> +++ b/Documentation/git-cat-file.txt
> @@ -10,7 +10,7 @@ SYNOPSIS
>  
>  [verse]
>  'git cat-file' (-t [--allow-unknown-type]| -s [--allow-unknown-type]| -e | 
> -p |  | --textconv | --filters ) [--use-path=] 
> -'git cat-file' (--batch | --batch-check) [--follow-symlinks]
> +'git cat-file' (--batch | --batch-check) [ --textconv | --filters ] 
> [--follow-symlinks]
>  
>  DESCRIPTION
>  ---
> @@ -20,7 +20,11 @@ object type, or `-s` is used to find the object size, or 
> `--textconv` or
>  `--filters` is used (which imply type "blob").
>  
>  In the second form, a list of objects (separated by linefeeds) is provided on
> -stdin, and the SHA-1, type, and size of each object is printed on stdout.
> +stdin, and the SHA-1, type, and size of each object is printed on stdout. The
> +output format can be overridden using the optional `` argument. If
> +either `--textconv` or `--filters` was specified, the input is expected to
> +list the object names followed by the path name, separated by a single white
> +space, so that the appropriate drivers can be determined.
>  
>  OPTIONS
>  ---
> @@ -72,13 +76,17 @@ OPTIONS
>  --batch::
>  --batch=::
>   Print object information and contents for each object provided
> - on stdin.  May not be combined with any other options or arguments.
> - See the section `BATCH OUTPUT` below for details.
> + on stdin.  May not be combined with any other options or arguments
> + except `--textconv` or `--filters`, in which case the input lines
> + also need to specify the path, separated by white space.  See the
> + section `BATCH OUTPUT` below for details.
>  
>  --batch-check::
>  --batch-check=::
>   Print object information for each object provided on stdin.  May
> - not be combined with any other options or arguments.  See the
> + not be combined with any other options or arguments except
> + `--textconv` or `--filters`, in which case the input lines also
> + need to specify the path, separated by white space.  See the
>   section `BATCH OUTPUT` below for details.
>  
>  --batch-all-objects::
> diff --git a/builtin/cat-file.c b/builtin/cat-file.c
> index 5ff58b3..5f91cf4 100644
> --- a/builtin/cat-file.c
> +++ b/builtin/cat-file.c
> @@ -17,6 +17,7 @@ struct batch_options {
>   int print_contents;
>   int buffer_output;
>   int all_objects;
> + int cmdmode; /* may be 'w' or 'c' for --filters or --textconv */
How do I read 'w' and 'c' ?
wilter and cextconv ? Does it make sense to use an enum here ?
Or a #define ?

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-18 Thread Jeff King
On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:

> With this patch, --batch can be combined with --textconv or --filters.
> For this to work, the input needs to have the form
> 
>   
> 
> so that the filters can be chosen appropriately.

The object name can have spaces in it, too. E.g.:

  HEAD:path with spaces

or even:

  :/grep for this

(as was pointed out to me when I tried to turn on %(rest) handling by
default, long ago). How do those work with your patch?

It looks like the extra split isn't enabled unless one of those options
is selected. Since --filters is new, that's OK for backwards
compatibility. But --textconv isn't. So I think:

  echo "HEAD:path with spaces" |
  git cat-file --textconv --batch

is regressed by this patch.

I wonder if we need an option specifically to say "the object name can
be split". Right now it kicks in automatically if you use "%(rest)" in
your format, but you might not care about passing along that output
(e.g., a lot of times I am piping "rev-list" straight to cat-file, and I
have to use a separate "cut" to throw away the pathnames).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-18 Thread Junio C Hamano
Johannes Schindelin  writes:

> With this patch, --batch can be combined with --textconv or --filters.
> For this to work, the input needs to have the form
>
>   
>
> so that the filters can be chosen appropriately.
>  --batch::
>  --batch=::
>   Print object information and contents for each object provided
> - on stdin.  May not be combined with any other options or arguments.
> - See the section `BATCH OUTPUT` below for details.
> + on stdin.  May not be combined with any other options or arguments
> + except `--textconv` or `--filters`, in which case the input lines
> + also need to specify the path, separated by white space.  See the
> + section `BATCH OUTPUT` below for details.

Makes sense.  This format still allows

HEAD: 

i.e. the object name is taken from path2 but we forget it and
pretend that the blob sits at path2, which is a good feature.

If I am not mistaken, your cover letter alluded that it might be
ideal to take "HEAD:" (nothing else) as input, grab ""
and feed that to the filtering machinery, but you decided to stop
short of doing that.  I actually think that it is the right thing to
do for this feature to ignore the trailing : in the object
name, iow, I agree with the result from the feature design POV.

The only thing that somewhat is worrysome is what would happen to
%(rest).  I guess [*1*] it is OK to declare that you cannot use %(rest) in
your output format when --filter/--textconv is in use, but if that
is the direction we are going, that limitation needs to be
documented.


[Footnote]

*1* This is just a "guess", because I do not know what people are
using %(rest) for.  It is plausible that their use case do not need
--filter/--textconv at all, and if that is the case, having this
limitation is perfectly fine.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 4/4] cat-file: support --textconv/--filters in batch mode

2016-08-18 Thread Junio C Hamano
Jeff King  writes:

> On Thu, Aug 18, 2016 at 02:46:28PM +0200, Johannes Schindelin wrote:
>
>> With this patch, --batch can be combined with --textconv or --filters.
>> For this to work, the input needs to have the form
>> 
>>  
>> 
>> so that the filters can be chosen appropriately.
>
> The object name can have spaces in it, too. E.g.:
>
>   HEAD:path with spaces
>
> or even:
>
>   :/grep for this

When I wrote my review, I didn't consider this use case.

There is no -z format in --batch, which is unfortunate.  If we had
one, it would trivially make it possible to do so, and we can even
have paths with LF in them ;-).  On the other hand, producing a NUL
separated input is a chore.

Perhaps a new and separate option that is similar to "--batch" but
lacks support for %(rest) and accepts _ONLY_ 40-hex as object name
is the best we can do, then?

> (as was pointed out to me when I tried to turn on %(rest) handling by
> default, long ago). How do those work with your patch?
>
> It looks like the extra split isn't enabled unless one of those options
> is selected. Since --filters is new, that's OK for backwards
> compatibility. But --textconv isn't. So I think:
>
>   echo "HEAD:path with spaces" |
>   git cat-file --textconv --batch
>
> is regressed by this patch.
>
> I wonder if we need an option specifically to say "the object name can
> be split". Right now it kicks in automatically if you use "%(rest)" in
> your format, but you might not care about passing along that output
> (e.g., a lot of times I am piping "rev-list" straight to cat-file, and I
> have to use a separate "cut" to throw away the pathnames).
>
> -Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html