Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
Junio C Hamano venit, vidit, dixit 06.02.2013 17:53: Michael J Gruber g...@drmicha.warpmail.net writes: Currently, diff and cat-file for blobs obey --textconv options (with the former defaulting to --textconv and the latter to --no-textconv) whereas show does not obey this option, even though it takes diff options. Make show on blobs behave like diff, i.e. obey --textconv by default and --no-textconv when given. What does log -p do currently, and what should it do? Does/should it also use --textconv? It invokes --textconv by default, and allows to override with --no-textconv. That's what I meant to say but seem to have failed to. I think at some point we decided to have human output on by default for all things diff (porcelain only, of course) unless the diff is meant for machine consumption, such as in the case of format-patch. The --textconv is a natural extension of what --ext-diff provides us, so I think it should trigger the same way as how --ext-diff triggers. This series' aim is to make the textconv behavior the same for all human output commands. I don't see textconv and ext-diff to be strongly related, and if then the other way round: textconv is about converting blobs to a (possibly lossy) human consumable text. ext-diff is about computing diffs with an external diff tool (not in the sense of diff-tool). One way of diffing is textconving blobs then using internal diff on the resulting text blobs, but ext-diff is more general and, really, orthogonal in a way. (It used to be used often for what textconv does when that didn't exist.) We apply --ext-diff for diff by default but not for log -p and show; I suspect this may have been for a good reason but I do not recall the discussion that led to the current behaviour offhand. I don't think I changed anything ext-diff related, did I? 1/4 is about showing blobs, not diffs. Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net --- builtin/log.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index 8f0b2e8..f83870d 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) { +unsigned char sha1c[20]; +struct object_context obj_context; +char *buf; +unsigned long size; + fflush(stdout); -return stream_blob_to_fd(1, sha1, NULL, 0); +if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV)) +return stream_blob_to_fd(1, sha1, NULL, 0); + +if (get_sha1_with_context(obj_name, 0, sha1c, obj_context)) +die(Not a valid object name %s, obj_name); +if (!obj_context.path[0] || +!textconv_object(obj_context.path, obj_context.mode, sha1c, 1, buf, size)) +return stream_blob_to_fd(1, sha1, NULL, 0); + +if (!buf) +die(git show %s: bad file, obj_name); + +write_or_die(1, buf, size); +return 0; } static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) @@ -491,7 +509,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) const char *name = objects[i].name; switch (o-type) { case OBJ_BLOB: -ret = show_blob_object(o-sha1, NULL); +ret = show_blob_object(o-sha1, rev, name); break; case OBJ_TAG: { struct tag *t = (struct tag *)o; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
Jeff King venit, vidit, dixit 06.02.2013 23:06: On Wed, Feb 06, 2013 at 04:08:50PM +0100, Michael J Gruber wrote: diff --git a/builtin/log.c b/builtin/log.c index 8f0b2e8..f83870d 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) Should this maybe just take the whole object_array_entry as a cleanup? It's just a question of one or two/three pointers (I can't count), but yes, that would be possible. { +unsigned char sha1c[20]; +struct object_context obj_context; +char *buf; +unsigned long size; + fflush(stdout); -return stream_blob_to_fd(1, sha1, NULL, 0); +if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV)) +return stream_blob_to_fd(1, sha1, NULL, 0); + +if (get_sha1_with_context(obj_name, 0, sha1c, obj_context)) +die(Not a valid object name %s, obj_name); It seems a little hacky that we have to look up the sha1 again. What should happen in the off chance that hashcmp(sha1, sha1c) != 0 due to a race with a simultaneous update of a ref? I thought about a check here but didn't bother to because I knew the refactoring would come up again... Would it be better if object_array_entry replaced its mode member with an object_context? Do all callers/users want to deal with object_context? I'm wondering why o_c has a mode at all, since it is mostly used in conjunction with an object, isn't it? The only downside I see is that we might waste a significant amount of memory (each context has a PATH_MAX buffer in it). That's why I used a reference to the struct, see my other reply. Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
On Thu, Feb 07, 2013 at 10:05:26AM +0100, Michael J Gruber wrote: Would it be better if object_array_entry replaced its mode member with an object_context? Do all callers/users want to deal with object_context? Wouldn't it just mean replacing entry-mode with entry-oc.mode at each user? I'm wondering why o_c has a mode at all, since it is mostly used in conjunction with an object, isn't it? Just as we record the path from the surrounding tree, we record the mode. It's that mode which gets put into the pending object list by the revision parser (see the very end of handle_revision_arg). Storing an object_context instead of the mode would be a strict superset of what we store now (right now we just throw the rest away). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
Jeff King venit, vidit, dixit 07.02.2013 10:11: On Thu, Feb 07, 2013 at 10:05:26AM +0100, Michael J Gruber wrote: Would it be better if object_array_entry replaced its mode member with an object_context? Do all callers/users want to deal with object_context? Wouldn't it just mean replacing entry-mode with entry-oc.mode at each user? Yes, I meant at the time of creation, i.e. when someone has to create and pass an o_a_e and maybe only knows a mode, and thus would have to set the path to NULL or . I'm wondering why o_c has a mode at all, since it is mostly used in conjunction with an object, isn't it? Just as we record the path from the surrounding tree, we record the mode. It's that mode which gets put into the pending object list by the revision parser (see the very end of handle_revision_arg). Storing an object_context instead of the mode would be a strict superset of what we store now (right now we just throw the rest away). Sure. But why does object_context have a mode member at all? Maybe it is not alway used together with another struct which has the mode already, then that's a reason. Michael -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
On Thu, Feb 07, 2013 at 10:34:26AM +0100, Michael J Gruber wrote: Just as we record the path from the surrounding tree, we record the mode. It's that mode which gets put into the pending object list by the revision parser (see the very end of handle_revision_arg). Storing an object_context instead of the mode would be a strict superset of what we store now (right now we just throw the rest away). Sure. But why does object_context have a mode member at all? Maybe it is not alway used together with another struct which has the mode already, then that's a reason. Exactly. It's purely for pulling information out of get_sha1_with_context, and does not know that you are going to put its output into an object_array_entry (and many call sites do not). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
Michael J Gruber g...@drmicha.warpmail.net writes: Currently, diff and cat-file for blobs obey --textconv options (with the former defaulting to --textconv and the latter to --no-textconv) whereas show does not obey this option, even though it takes diff options. Make show on blobs behave like diff, i.e. obey --textconv by default and --no-textconv when given. What does log -p do currently, and what should it do? Does/should it also use --textconv? The --textconv is a natural extension of what --ext-diff provides us, so I think it should trigger the same way as how --ext-diff triggers. We apply --ext-diff for diff by default but not for log -p and show; I suspect this may have been for a good reason but I do not recall the discussion that led to the current behaviour offhand. Signed-off-by: Michael J Gruber g...@drmicha.warpmail.net --- builtin/log.c | 24 +--- 1 file changed, 21 insertions(+), 3 deletions(-) diff --git a/builtin/log.c b/builtin/log.c index 8f0b2e8..f83870d 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) { + unsigned char sha1c[20]; + struct object_context obj_context; + char *buf; + unsigned long size; + fflush(stdout); - return stream_blob_to_fd(1, sha1, NULL, 0); + if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (get_sha1_with_context(obj_name, 0, sha1c, obj_context)) + die(Not a valid object name %s, obj_name); + if (!obj_context.path[0] || + !textconv_object(obj_context.path, obj_context.mode, sha1c, 1, buf, size)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (!buf) + die(git show %s: bad file, obj_name); + + write_or_die(1, buf, size); + return 0; } static int show_tag_object(const unsigned char *sha1, struct rev_info *rev) @@ -491,7 +509,7 @@ int cmd_show(int argc, const char **argv, const char *prefix) const char *name = objects[i].name; switch (o-type) { case OBJ_BLOB: - ret = show_blob_object(o-sha1, NULL); + ret = show_blob_object(o-sha1, rev, name); break; case OBJ_TAG: { struct tag *t = (struct tag *)o; -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
On Wed, Feb 06, 2013 at 04:08:50PM +0100, Michael J Gruber wrote: diff --git a/builtin/log.c b/builtin/log.c index 8f0b2e8..f83870d 100644 --- a/builtin/log.c +++ b/builtin/log.c @@ -402,10 +402,28 @@ static void show_tagger(char *buf, int len, struct rev_info *rev) strbuf_release(out); } -static int show_blob_object(const unsigned char *sha1, struct rev_info *rev) +static int show_blob_object(const unsigned char *sha1, struct rev_info *rev, const char *obj_name) Should this maybe just take the whole object_array_entry as a cleanup? { + unsigned char sha1c[20]; + struct object_context obj_context; + char *buf; + unsigned long size; + fflush(stdout); - return stream_blob_to_fd(1, sha1, NULL, 0); + if (!DIFF_OPT_TST(rev-diffopt, ALLOW_TEXTCONV)) + return stream_blob_to_fd(1, sha1, NULL, 0); + + if (get_sha1_with_context(obj_name, 0, sha1c, obj_context)) + die(Not a valid object name %s, obj_name); It seems a little hacky that we have to look up the sha1 again. What should happen in the off chance that hashcmp(sha1, sha1c) != 0 due to a race with a simultaneous update of a ref? Would it be better if object_array_entry replaced its mode member with an object_context? The only downside I see is that we might waste a significant amount of memory (each context has a PATH_MAX buffer in it). -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
On Wed, Feb 06, 2013 at 08:53:52AM -0800, Junio C Hamano wrote: Michael J Gruber g...@drmicha.warpmail.net writes: Currently, diff and cat-file for blobs obey --textconv options (with the former defaulting to --textconv and the latter to --no-textconv) whereas show does not obey this option, even though it takes diff options. Make show on blobs behave like diff, i.e. obey --textconv by default and --no-textconv when given. What does log -p do currently, and what should it do? Does/should it also use --textconv? The --textconv is a natural extension of what --ext-diff provides us, so I think it should trigger the same way as how --ext-diff triggers. We apply --ext-diff for diff by default but not for log -p and show; I suspect this may have been for a good reason but I do not recall the discussion that led to the current behaviour offhand. I think Michael's commit message explains the situation badly. --textconv is already on for git show (and for git log) by default. Diffs already use it. This is more about the fact that when showing a single blob, we do not bother to remember the context of the sha1 lookup, including its pathname. Therefore we were not previously able to apply any .gitattributes to the output. So this patch really does two things: 1. Pass the information along to show_blob_object so that it can look up .gitattributes. 2. Apply the textconv attribute (if ALLOW_TEXTCONV is on, of course). And stating it that way makes it clear that there may be other missing steps (3 and up) to apply other gitattributes. For example, should git show $blob respect crlf attributes? Smudge filters? -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
Jeff King p...@peff.net writes: And stating it that way makes it clear that there may be other missing steps (3 and up) to apply other gitattributes. For example, should git show $blob respect crlf attributes? Smudge filters? Yeah, I think applying these when path is availble may make sense. Are we going to teach cat-file to honor them with --textconv and possibly other new options? Or should the --textconv imply application of these other filters? I am tempted to say yes, but I haven't thought things through... -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
On Wed, Feb 06, 2013 at 03:49:44PM -0800, Junio C Hamano wrote: Jeff King p...@peff.net writes: And stating it that way makes it clear that there may be other missing steps (3 and up) to apply other gitattributes. For example, should git show $blob respect crlf attributes? Smudge filters? Yeah, I think applying these when path is availble may make sense. Are we going to teach cat-file to honor them with --textconv and possibly other new options? Or should the --textconv imply application of these other filters? I am tempted to say yes, but I haven't thought things through... For diff, we should already recognize them (because we feed the diff machinery the path name, and it does the usual attributes lookup). Of course it only uses things like funcname, not any of the convert-to-filesystem options (hmm, actually, I guess it may use convert-from-filesystem, but in such a case it would not be reading from a blob anyway, but from a filesystem path, so it is not related to this code). For cat-file, I don't think --textconv should necessarily imply it. The original motivation for cat-file --textconv was for external blame to be able to access the converted contents. But it would not want to do filesystem-level conversion; it wants the canonical version of the file. I think a better option name for smudge/crlf would be --to-filesystem or something like that. And probably it should not be the default. git-show should probably have the same option. I doubt it should be on by default, but I can see it being useful for: git show --to-filesystem HEAD:Makefile Makefile or whatever. I don't think those features necessarily need to come as part of Michael's series. They can wait for people who care to implement them. But I do think the refactoring of passing along the path from the object_context should keep in mind that we will probably want to go that way eventually. Are there other attributes that we might care about when showing a blob? The only ones I can think of are the ones for converting to the filesystem representation. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFC/PATCH 1/4] show: obey --textconv for blobs
Jeff King p...@peff.net writes: git-show should probably have the same option. I doubt it should be on by default, but I can see it being useful for: git show --to-filesystem HEAD:Makefile Makefile or whatever. I don't think those features necessarily need to come as part of Michael's series. Yeah, all makes sense (including the part that it might be useful but it does not belong to this series). Thanks for sanity checking. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html