Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-07 Thread Michael J Gruber
Junio C Hamano venit, vidit, dixit 06.02.2013 17:53:
 Michael J Gruber g...@drmicha.warpmail.net writes:
 
 Currently, diff and cat-file for blobs obey --textconv options
 (with the former defaulting to --textconv and the latter to
 --no-textconv) whereas show does not obey this option, even though
 it takes diff options.

 Make show on blobs behave like diff, i.e. obey --textconv by
 default and --no-textconv when given.
 
 What does log -p do currently, and what should it do?  Does/should
 it also use --textconv?

It invokes --textconv by default, and allows to override with
--no-textconv. That's what I meant to say but seem to have failed to.

I think at some point we decided to have human output on by default
for all things diff (porcelain only, of course) unless the diff is meant
for machine consumption, such as in the case of format-patch.

 The --textconv is a natural extension of what --ext-diff provides us,
 so I think it should trigger the same way as how --ext-diff triggers.

This series' aim is to make the textconv behavior the same for all
human output commands. I don't see textconv and ext-diff to be
strongly related, and if then the other way round:

textconv is about converting blobs to a (possibly lossy) human
consumable text.

ext-diff is about computing diffs with an external diff tool (not in
the sense of diff-tool).

One way of diffing is textconving blobs then using internal diff on the
resulting text blobs, but ext-diff is more general and, really,
orthogonal in a way. (It used to be used often for what textconv does
when that didn't exist.)

 We apply --ext-diff for diff by default but not for log -p and
 show; I suspect this may have been for a good reason but I do not
 recall the discussion that led to the current behaviour offhand.

I don't think I changed anything ext-diff related, did I? 1/4 is about
showing blobs, not diffs.

 Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net
 ---
  builtin/log.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)

 diff --git a/builtin/log.c b/builtin/log.c
 index 8f0b2e8..f83870d 100644
 --- a/builtin/log.c
 +++ b/builtin/log.c
 @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct 
 rev_info *rev)
  strbuf_release(out);
  }
  
 -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
 +static int show_blob_object(const unsigned char *sha1, struct rev_info 
 *rev, const char *obj_name)
  {
 +unsigned char sha1c[20];
 +struct object_context obj_context;
 +char *buf;
 +unsigned long size;
 +
  fflush(stdout);
 -return stream_blob_to_fd(1, sha1, NULL, 0);
 +if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV))
 +return stream_blob_to_fd(1, sha1, NULL, 0);
 +
 +if (get_sha1_with_context(obj_name, 0, sha1c, obj_context))
 +die(Not a valid object name %s, obj_name);
 +if (!obj_context.path[0] ||
 +!textconv_object(obj_context.path, obj_context.mode, sha1c, 1, 
 buf, size))
 +return stream_blob_to_fd(1, sha1, NULL, 0);
 +
 +if (!buf)
 +die(git show %s: bad file, obj_name);
 +
 +write_or_die(1, buf, size);
 +return 0;
  }
  
  static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
 @@ -491,7 +509,7 @@ int cmd_show(int argc, const char **argv, const char 
 *prefix)
  const char *name = objects[i].name;
  switch (o-type) {
  case OBJ_BLOB:
 -ret = show_blob_object(o-sha1, NULL);
 +ret = show_blob_object(o-sha1, rev, name);
  break;
  case OBJ_TAG: {
  struct tag *t = (struct tag *)o;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-07 Thread Michael J Gruber
Jeff King venit, vidit, dixit 06.02.2013 23:06:
 On Wed, Feb 06, 2013 at 04:08:50PM +0100, Michael J Gruber wrote:
 
 diff --git a/builtin/log.c b/builtin/log.c
 index 8f0b2e8..f83870d 100644
 --- a/builtin/log.c
 +++ b/builtin/log.c
 @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct 
 rev_info *rev)
  strbuf_release(out);
  }
  
 -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
 +static int show_blob_object(const unsigned char *sha1, struct rev_info 
 *rev, const char *obj_name)
 
 Should this maybe just take the whole object_array_entry as a cleanup?

It's just a question of one or two/three pointers (I can't count), but
yes, that would be possible.

  {
 +unsigned char sha1c[20];
 +struct object_context obj_context;
 +char *buf;
 +unsigned long size;
 +
  fflush(stdout);
 -return stream_blob_to_fd(1, sha1, NULL, 0);
 +if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV))
 +return stream_blob_to_fd(1, sha1, NULL, 0);
 +
 +if (get_sha1_with_context(obj_name, 0, sha1c, obj_context))
 +die(Not a valid object name %s, obj_name);
 
 It seems a little hacky that we have to look up the sha1 again. What
 should happen in the off chance that hashcmp(sha1, sha1c) != 0 due to
 a race with a simultaneous update of a ref?

I thought about a check here but didn't bother to because I knew the
refactoring would come up again...

 Would it be better if object_array_entry replaced its mode member with
 an object_context?

Do all callers/users want to deal with object_context?

I'm wondering why o_c has a mode at all, since it is mostly used in
conjunction with an object, isn't it?

 The only downside I see is that we might waste a
 significant amount of memory (each context has a PATH_MAX buffer in it).

That's why I used a reference to the struct, see my other reply.

Michael
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-07 Thread Jeff King
On Thu, Feb 07, 2013 at 10:05:26AM +0100, Michael J Gruber wrote:

  Would it be better if object_array_entry replaced its mode member with
  an object_context?
 
 Do all callers/users want to deal with object_context?

Wouldn't it just mean replacing entry-mode with entry-oc.mode at
each user?

 I'm wondering why o_c has a mode at all, since it is mostly used in
 conjunction with an object, isn't it?

Just as we record the path from the surrounding tree, we record the
mode. It's that mode which gets put into the pending object list by the
revision parser (see the very end of handle_revision_arg). Storing an
object_context instead of the mode would be a strict superset of what we
store now (right now we just throw the rest away).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-07 Thread Michael J Gruber
Jeff King venit, vidit, dixit 07.02.2013 10:11:
 On Thu, Feb 07, 2013 at 10:05:26AM +0100, Michael J Gruber wrote:
 
 Would it be better if object_array_entry replaced its mode member with
 an object_context?

 Do all callers/users want to deal with object_context?
 
 Wouldn't it just mean replacing entry-mode with entry-oc.mode at
 each user?

Yes, I meant at the time of creation, i.e. when someone has to create
and pass an o_a_e and maybe only knows a mode, and thus would have to
set the path to NULL or .

 I'm wondering why o_c has a mode at all, since it is mostly used in
 conjunction with an object, isn't it?
 
 Just as we record the path from the surrounding tree, we record the
 mode. It's that mode which gets put into the pending object list by the
 revision parser (see the very end of handle_revision_arg). Storing an
 object_context instead of the mode would be a strict superset of what we
 store now (right now we just throw the rest away).

Sure. But why does object_context have a mode member at all? Maybe it is
not alway used together with another struct which has the mode already,
then that's a reason.

Michael
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-07 Thread Jeff King
On Thu, Feb 07, 2013 at 10:34:26AM +0100, Michael J Gruber wrote:

  Just as we record the path from the surrounding tree, we record the
  mode. It's that mode which gets put into the pending object list by the
  revision parser (see the very end of handle_revision_arg). Storing an
  object_context instead of the mode would be a strict superset of what we
  store now (right now we just throw the rest away).
 
 Sure. But why does object_context have a mode member at all? Maybe it is
 not alway used together with another struct which has the mode already,
 then that's a reason.

Exactly. It's purely for pulling information out of
get_sha1_with_context, and does not know that you are going to put its
output into an object_array_entry (and many call sites do not).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-06 Thread Junio C Hamano
Michael J Gruber g...@drmicha.warpmail.net writes:

 Currently, diff and cat-file for blobs obey --textconv options
 (with the former defaulting to --textconv and the latter to
 --no-textconv) whereas show does not obey this option, even though
 it takes diff options.

 Make show on blobs behave like diff, i.e. obey --textconv by
 default and --no-textconv when given.

What does log -p do currently, and what should it do?  Does/should
it also use --textconv?

The --textconv is a natural extension of what --ext-diff provides us,
so I think it should trigger the same way as how --ext-diff triggers.

We apply --ext-diff for diff by default but not for log -p and
show; I suspect this may have been for a good reason but I do not
recall the discussion that led to the current behaviour offhand.

 Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net
 ---
  builtin/log.c | 24 +---
  1 file changed, 21 insertions(+), 3 deletions(-)

 diff --git a/builtin/log.c b/builtin/log.c
 index 8f0b2e8..f83870d 100644
 --- a/builtin/log.c
 +++ b/builtin/log.c
 @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct 
 rev_info *rev)
   strbuf_release(out);
  }
  
 -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
 +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, 
 const char *obj_name)
  {
 + unsigned char sha1c[20];
 + struct object_context obj_context;
 + char *buf;
 + unsigned long size;
 +
   fflush(stdout);
 - return stream_blob_to_fd(1, sha1, NULL, 0);
 + if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV))
 + return stream_blob_to_fd(1, sha1, NULL, 0);
 +
 + if (get_sha1_with_context(obj_name, 0, sha1c, obj_context))
 + die(Not a valid object name %s, obj_name);
 + if (!obj_context.path[0] ||
 + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, 
 buf, size))
 + return stream_blob_to_fd(1, sha1, NULL, 0);
 +
 + if (!buf)
 + die(git show %s: bad file, obj_name);
 +
 + write_or_die(1, buf, size);
 + return 0;
  }
  
  static int show_tag_object(const unsigned char *sha1, struct rev_info *rev)
 @@ -491,7 +509,7 @@ int cmd_show(int argc, const char **argv, const char 
 *prefix)
   const char *name = objects[i].name;
   switch (o-type) {
   case OBJ_BLOB:
 - ret = show_blob_object(o-sha1, NULL);
 + ret = show_blob_object(o-sha1, rev, name);
   break;
   case OBJ_TAG: {
   struct tag *t = (struct tag *)o;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-06 Thread Jeff King
On Wed, Feb 06, 2013 at 04:08:50PM +0100, Michael J Gruber wrote:

 diff --git a/builtin/log.c b/builtin/log.c
 index 8f0b2e8..f83870d 100644
 --- a/builtin/log.c
 +++ b/builtin/log.c
 @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct 
 rev_info *rev)
   strbuf_release(out);
  }
  
 -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev)
 +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, 
 const char *obj_name)

Should this maybe just take the whole object_array_entry as a cleanup?

  {
 + unsigned char sha1c[20];
 + struct object_context obj_context;
 + char *buf;
 + unsigned long size;
 +
   fflush(stdout);
 - return stream_blob_to_fd(1, sha1, NULL, 0);
 + if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV))
 + return stream_blob_to_fd(1, sha1, NULL, 0);
 +
 + if (get_sha1_with_context(obj_name, 0, sha1c, obj_context))
 + die(Not a valid object name %s, obj_name);

It seems a little hacky that we have to look up the sha1 again. What
should happen in the off chance that hashcmp(sha1, sha1c) != 0 due to
a race with a simultaneous update of a ref?

Would it be better if object_array_entry replaced its mode member with
an object_context? The only downside I see is that we might waste a
significant amount of memory (each context has a PATH_MAX buffer in it).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-06 Thread Jeff King
On Wed, Feb 06, 2013 at 08:53:52AM -0800, Junio C Hamano wrote:

 Michael J Gruber g...@drmicha.warpmail.net writes:
 
  Currently, diff and cat-file for blobs obey --textconv options
  (with the former defaulting to --textconv and the latter to
  --no-textconv) whereas show does not obey this option, even though
  it takes diff options.
 
  Make show on blobs behave like diff, i.e. obey --textconv by
  default and --no-textconv when given.
 
 What does log -p do currently, and what should it do?  Does/should
 it also use --textconv?
 
 The --textconv is a natural extension of what --ext-diff provides us,
 so I think it should trigger the same way as how --ext-diff triggers.
 
 We apply --ext-diff for diff by default but not for log -p and
 show; I suspect this may have been for a good reason but I do not
 recall the discussion that led to the current behaviour offhand.

I think Michael's commit message explains the situation badly.
--textconv is already on for git show (and for git log) by default.
Diffs already use it.

This is more about the fact that when showing a single blob, we do not
bother to remember the context of the sha1 lookup, including its
pathname. Therefore we were not previously able to apply any
.gitattributes to the output. So this patch really does two things:

  1. Pass the information along to show_blob_object so that it can
 look up .gitattributes.

  2. Apply the textconv attribute (if ALLOW_TEXTCONV is on, of course).

And stating it that way makes it clear that there may be other missing
steps (3 and up) to apply other gitattributes. For example, should git
show $blob respect crlf attributes? Smudge filters?

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 And stating it that way makes it clear that there may be other missing
 steps (3 and up) to apply other gitattributes. For example, should git
 show $blob respect crlf attributes? Smudge filters?

Yeah, I think applying these when path is availble may make sense.

Are we going to teach cat-file to honor them with --textconv and
possibly other new options?

Or should the --textconv imply application of these other filters?
I am tempted to say yes, but I haven't thought things through...

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-06 Thread Jeff King
On Wed, Feb 06, 2013 at 03:49:44PM -0800, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  And stating it that way makes it clear that there may be other missing
  steps (3 and up) to apply other gitattributes. For example, should git
  show $blob respect crlf attributes? Smudge filters?
 
 Yeah, I think applying these when path is availble may make sense.
 
 Are we going to teach cat-file to honor them with --textconv and
 possibly other new options?
 
 Or should the --textconv imply application of these other filters?
 I am tempted to say yes, but I haven't thought things through...

For diff, we should already recognize them (because we feed the diff
machinery the path name, and it does the usual attributes lookup). Of
course it only uses things like funcname, not any of the
convert-to-filesystem options (hmm, actually, I guess it may use
convert-from-filesystem, but in such a case it would not be reading from
a blob anyway, but from a filesystem path, so it is not related to this
code).

For cat-file, I don't think --textconv should necessarily imply it. The
original motivation for cat-file --textconv was for external blame to
be able to access the converted contents. But it would not want to do
filesystem-level conversion; it wants the canonical version of the file.
I think a better option name for smudge/crlf would be --to-filesystem
or something like that. And probably it should not be the default.

git-show should probably have the same option. I doubt it should be on
by default, but I can see it being useful for:

  git show --to-filesystem HEAD:Makefile Makefile

or whatever. I don't think those features necessarily need to come as
part of Michael's series. They can wait for people who care to implement
them. But I do think the refactoring of passing along the path from the
object_context should keep in mind that we will probably want to go that
way eventually.

Are there other attributes that we might care about when showing a blob?
The only ones I can think of are the ones for converting to the
filesystem representation.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC/PATCH 1/4] show: obey --textconv for blobs

2013-02-06 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 git-show should probably have the same option. I doubt it should be on
 by default, but I can see it being useful for:

   git show --to-filesystem HEAD:Makefile Makefile

 or whatever. I don't think those features necessarily need to come as
 part of Michael's series.

Yeah, all makes sense (including the part that it might be useful
but it does not belong to this series).

Thanks for sanity checking.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html