Re: [PATCH v8] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible
Hello Junio Thank you for your comment. but this patch will show the source and the destination paths, both of which are truncated even more severely, because it always has to spend display columns for an extra ... (to show truncation of the source side), = (to show that it is a rename), and {,} pair (again to show that it is a rename). To be more accurate, renaming output dose not always contains { or } if there is no common part in source and destination paths, although probably there are enough large possibility to include { or }. And, in the original patch, { or } is not kept, but changed to be kept based Thomas Rast's feedback below. (So, there was no possibility to have {… = …} in the original patch.) On Oct 13, 2013, at 11:29 PM, Thomas Rast t...@thomasrast.ch wrote: Note that in the test, the generated line looks like this: {..._does_not_fit_in_a_single_line = .../path1 | 0 I don't want to go all bikesheddey, but I think it's somewhat unfortunate that the elided parts do not correspond to each other. In particular, I think the closing brace should not be omitted. Perhaps something like this would be ideal (making it up on the spot, don't count characters): {...a_single_line = ..._as_the_first}/path1 | 0 And, it might be a bit nicer for me if the patch can be rejected(or ignored as other patches) from the beginning if the concept does not fit anyway. # Though I know we can know more after seeing the implementation, anyway :-) # And, my original explanation about the patch might be not enough. Thanks ! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 22, 2013, at 9:09 PM, Junio C Hamano gits...@pobox.com wrote: Yoshioka Tsuneo yoshiokatsu...@gmail.com writes: Also, I guess Junio might be suspicious to the idea to keep arrow(=) itself, maybe ? I think there is no single right solution to this issue, and it has to boils down to the taste. When you are viewing diff --stat -M output in wide-enough medium, you are seeing three pieces of information: what the source path was, what the destination path will be, and what amount of change is made with the change. When the output width is too narrow to show these paths, with the current code, you see truncated destination path, possibly without the source path, but this patch will show the source and the destination paths, both of which are truncated even more severely, because it always has to spend display columns for an extra ... (to show truncation of the source side), = (to show that it is a rename), and {,} pair (again to show that it is a rename). If the destination does not fit, the output before this patch would have thrown these away as part of left-truncation, to show the destination path as maximally as possible. We do not have even half the width of the current truncated to be destination only output for each path. I am afraid that in the cases where the patch makes a difference, what happens would be that you can no longer tell what source or destination paths really are, because the leading directory part gets truncated too much, and if we didn't have this patch, at least you can tell what destination path is affected. We would trade the guessability of at least one path (the destination) with just a single bit of information (an unidentifiable path got renamed to another unidentifiable path). I am not yet convinced that it is a good trade-off. Especially given the diffstat output is not about files but more about contents, between an output in the extreme case the version after the patch needs to produce {... = ...}/controller/Makefile | 7 +++ that tells us 7 lines were updated in the procedure to build some unknown controller by copying or renaming from the build procedure of some other unknown controller, and the output the current code would give to the same rename .}/fooGadget/controller/Makefile | 7 +++ that tells us 7 lines were updated in the build procedure for the foo Gadget, I think the latter contains more useful information, even though it does lose one bit of information (there was a rename involved in producing this final path) compared to the version with the patch. So you are correct to say that I am still skeptical. In any case, the output from diff --stat -M should match the output from apply --stat -M, I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible
Hello Thomas Can you briefly describe what you changed in v7 and v8, both compared to earlier versions and between v7 and v8? On v7, sfx's basename part is tried to kept. On v7, whole sfx part is tried to kept. For example, in case below: parent_path{sourceDirectory = DestinationDirectory}path1/path2//longlongFilename.txt On v7, this can be like: …{...ceDirectory = …onDirectory}.../longlongFilename.txt On v8, it will be like: …{...irectory = …irectory}path1/path2/longlongFilename.txt This change is based on the review from Junio below. (I myself is not sure what is the better way.) On Oct 17, 2013, at 10:29 PM, Junio C Hamano gits...@pobox.com wrote: I am not sure if distributing the burden of truncation equally to three parts so that the resulting pieces are of similar lengths is really a good idea. Between these two {...SourceDirectory = ...nationDirectory}...ileThatWasMoved {...ceDirectory = ...ionDirectory}nameOfTheFileThatWasMoved that attempt to show that the file nameOfTheFileThatWasMoved was moved from the longSourceDirectory to the DestinationDirectory, the latter is much more informative, I would think. On Oct 18, 2013, at 1:38 AM, Junio C Hamano gits...@pobox.com wrote: Yoshioka Tsuneo yoshiokatsu...@gmail.com writes: In the [PATCH v7], I changed to keep filename part of suffix to handle above case, but not always keep directory part because I feel totally keeping all part of long suffix including directory name may cause output like: …{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved And, above may be worse than: ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved I think. I am not sure if I agree. Losing LongPath2 part may be more significant data loss than losing a single bit that says the change is a rename, as the latter may not quite tell us what these two directories were anyway. Also, I guess Junio might be suspicious to the idea to keep arrow(=) itself, maybe ? = (From What's cooking in git.git (Oct 2013, #04; Fri, 18)) - diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible Attempts to give more weight on the fact that a filepair represents a rename than showing substring of the actual path when diffstat lines are not wide enough. I am not sure if that is solving a right problem, though. = Thanks! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 19, 2013, at 9:24 AM, Thomas Rast t...@thomasrast.ch wrote: Yoshioka Tsuneo yoshiokatsu...@gmail.com writes: git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. Before this commit, this output is shortened always by omitting left most part like ...foo = barbarbar. So, if the destination filename is too long, source filename putting left or arrow can be totally omitted like ...barbarbar, without including any of foofoofoo =. In such a case where arrow symbol is omitted, there is no way to know whether the file is renamed or existed in the original. Make sure there is always an arrow, like ...foo = ...bar. The output can contain curly braces('{','}') for grouping. So, in general, the output format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b trying to have the same maximum length. If it is not enough yet, omit sfx. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com Test-added-by: Thomas Rast tr...@inf.ethz.ch --- Can you briefly describe what you changed in v7 and v8, both compared to earlier versions and between v7 and v8? It would be very nice if you could always include such a patch changelog after the --- above. git-am will ignore the text between --- and the diff, so you can write comments for the reviewers there without creating noise in the commit message. Also, please keep reviewers in the Cc list for future discussion/patches so that they will see them. -- Thomas Rast t...@thomasrast.ch -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v8] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. Before this commit, this output is shortened always by omitting left most part like ...foo = barbarbar. So, if the destination filename is too long, source filename putting left or arrow can be totally omitted like ...barbarbar, without including any of foofoofoo =. In such a case where arrow symbol is omitted, there is no way to know whether the file is renamed or existed in the original. Make sure there is always an arrow, like ...foo = ...bar. The output can contain curly braces('{','}') for grouping. So, in general, the output format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b trying to have the same maximum length. If it is not enough yet, omit sfx. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com Test-added-by: Thomas Rast tr...@inf.ethz.ch --- diff.c | 176 ++--- t/t4001-diff-rename.sh | 12 2 files changed, 166 insertions(+), 22 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..69c3e17 100644 --- a/diff.c +++ b/diff.c @@ -1258,11 +1258,13 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } } -static char *pprint_rename(const char *a, const char *b) +static void find_common_prefix_suffix(const char *a, const char *b, + struct strbuf *pfx, + struct strbuf *a_mid, struct strbuf *b_mid, + struct strbuf *sfx) { const char *old = a; const char *new = b; - struct strbuf name = STRBUF_INIT; int pfx_length, sfx_length; int pfx_adjust_for_slash; int len_a = strlen(a); @@ -1272,10 +1274,9 @@ static char *pprint_rename(const char *a, const char *b) int qlen_b = quote_c_style(b, NULL, NULL, 0); if (qlen_a || qlen_b) { - quote_c_style(a, name, NULL, 0); - strbuf_addstr(name, = ); - quote_c_style(b, name, NULL, 0); - return strbuf_detach(name, NULL); + quote_c_style(a, a_mid, NULL, 0); + quote_c_style(b, b_mid, NULL, 0); + return; } /* Find common prefix */ @@ -1322,17 +1323,138 @@ static char *pprint_rename(const char *a, const char *b) if (b_midlen 0) b_midlen = 0; - strbuf_grow(name, pfx_length + a_midlen + b_midlen + sfx_length + 7); - if (pfx_length + sfx_length) { - strbuf_add(name, a, pfx_length); + strbuf_add(pfx, a, pfx_length); + strbuf_add(a_mid, a + pfx_length, a_midlen); + strbuf_add(b_mid, b + pfx_length, b_midlen); + strbuf_add(sfx, a + len_a - sfx_length, sfx_length); +} + +/* + * Omit each parts to fix in name_width. + * Formatted string is pfx{a_mid = b_mid}sfx. + * At first, omit pfx as long as possible. + * If it is not enough, omit a_mid, b_mid trying to set the length of + * those 2 parts(including ...) to the same. + * If it is not enough yet, omit sfx. + * Ex: + * foofoofoo = barbarbar + * will be like + * ...foo = ...bar. + * long_parent{foofoofoo = barbarbar}path/filename + * will be like + * ...parent{...foofoo = ...barbar}path/filename + */ +static void rename_omit(struct strbuf *pfx, + struct strbuf *a_mid, struct strbuf *b_mid, + struct strbuf *sfx, + int name_width) +{ + static const char arrow[] = = ; + static const char dots[] = ...; + int use_curly_braces = (pfx-len 0) || (sfx-len 0); + size_t name_len; + size_t max_part_len = 0; + size_t remainder_part_len = 0; + size_t left, right; + size_t max_sfx_len; + size_t sfx_len; + + name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow) + + (use_curly_braces ? 2 : 0); + + if (name_len = name_width) + return; /* everything fits in name_width */ + + if (use_curly_braces) { + if (strlen(dots) + (name_len - pfx-len) = name_width) { + /* +* Just omitting left of '{' is enough +* Ex: ...aaa{foofoofoo = bar}file +*/ + strbuf_splice(pfx, 0, name_len - name_width + strlen(dots), dots, strlen(dots)); + return; + } else { + if (pfx-len strlen(dots)) { + /* +* Just omitting left of '{' is not enough +* name will be ...{SOMETHING}SOMETHING +*/ + strbuf_reset(pfx); +
Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
Hello Junio In the [PATCH v7], I changed to keep filename part of suffix to handle above case, but not always keep directory part because I feel totally keeping all part of long suffix including directory name may cause output like: …{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved And, above may be worse than: ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved I think. I am not sure if I agree. Losing LongPath2 part may be more significant data loss than losing a single bit that says the change is a rename, as the latter may not quite tell us what these two directories were anyway. I'm not sure which is the better in general. But anyway, I don't have strong opinion about this. So, I just changed to keep the all of the sfx part(lator than '}'). I just sent the updated patch as [PATCH v8]. Thanks ! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 18, 2013, at 1:38 AM, Junio C Hamano gits...@pobox.com wrote: Yoshioka Tsuneo yoshiokatsu...@gmail.com writes: In the [PATCH v7], I changed to keep filename part of suffix to handle above case, but not always keep directory part because I feel totally keeping all part of long suffix including directory name may cause output like: …{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved And, above may be worse than: ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved I think. I am not sure if I agree. Losing LongPath2 part may be more significant data loss than losing a single bit that says the change is a rename, as the latter may not quite tell us what these two directories were anyway. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] diffcore-rename.c: Estimate filename similarity for rename detection
On rename detection like command git diff -M ..., rename is detected based on file similarities. This file similarities are calculated based on the contents of file. And, if the similarities of contents are the same, filename is taken into account. But, the similarity of filename is calculated just whether the basename is the same or not, and always returns just one or zero. So, for example, if there are multiple same files in the diff-ing commits, the result of rename detection is almost random, without taking into account the similarity of filename. Calculate filename similarities, and use the result to compare file similarity in case contents similarities are the same. Use Sorensen-Dice coefficient of bigrams in strings to calculate filename similarities because it take into account all part of the filenames, and time complexity is O(N), assuming N is the length of filenames. --- diffcore-rename.c | 81 +++ 1 file changed, 64 insertions(+), 17 deletions(-) diff --git a/diffcore-rename.c b/diffcore-rename.c index 6c7a72f..355ea6d 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -96,26 +96,51 @@ static struct diff_rename_src *register_rename_src(struct diff_filepair *p) return (rename_src[first]); } -static int basename_same(struct diff_filespec *src, struct diff_filespec *dst) +static int estimate_filename_similarity(struct diff_filespec *src, struct diff_filespec *dst) { - int src_len = strlen(src-path), dst_len = strlen(dst-path); - while (src_len dst_len) { - char c1 = src-path[--src_len]; - char c2 = dst-path[--dst_len]; - if (c1 != c2) - return 0; - if (c1 == '/') - return 1; + /* +* Calculate similarity between src path and dst path using +* Sorensen-Dice coefficient of bigrams in strings +*/ + const char *src_path = src-path; + const char *dst_path = dst-path; + size_t src_len = strlen(src_path); + size_t dst_len = strlen(dst_path); + static uint16_t src_bigram[256][256]; + int i; + int bigrams_number = 0; + int similarity; + + for (i=0; isrc_len; i++) { + int c1 = ((unsigned char*)src_path)[i]; + int c2 = ((unsigned char*)src_path)[i + 1]; + src_bigram[c1][c2] ++; + } + for (i=0; idst_len; i++) { + int c1 = ((unsigned char*)dst_path)[i]; + int c2 = ((unsigned char*)dst_path)[i + 1]; + if (src_bigram[c1][c2] 0) { + src_bigram[c1][c2] --; + bigrams_number ++; + } + } + similarity = MAX_SCORE * 2 * bigrams_number / (src_len + dst_len); + +/* Clean up src_bigram */ + for (i=0; isrc_len; i++) { + int c1 = ((unsigned char*)src_path)[i]; + int c2 = ((unsigned char*)src_path)[i + 1]; + src_bigram[c1][c2] = 0; } - return (!src_len || src-path[src_len - 1] == '/') - (!dst_len || dst-path[dst_len - 1] == '/'); + + return similarity; } struct diff_score { int src; /* index in rename_src */ int dst; /* index in rename_dst */ unsigned short score; - short name_score; + unsigned short name_score; }; static int estimate_similarity(struct diff_filespec *src, @@ -228,7 +253,7 @@ static void record_rename_pair(int dst_index, int src_index, int score) */ static int score_compare(const void *a_, const void *b_) { - const struct diff_score *a = a_, *b = b_; + struct diff_score *a = (struct diff_score *)a_, *b = (struct diff_score *)b_; /* sink the unused ones to the bottom */ if (a-dst 0) @@ -236,8 +261,23 @@ static int score_compare(const void *a_, const void *b_) else if (b-dst 0) return -1; - if (a-score == b-score) + if (a-score == b-score){ + if(a-score == 0) + return 0; + /* Calculate name_score only when both score is the same */ + if(a-name_score == USHRT_MAX){ + struct diff_filespec *two = rename_dst[a-dst].two; + struct diff_filespec *one = rename_src[a-src].p-one; + a-name_score = estimate_filename_similarity(one, two); + } + if(b-name_score == USHRT_MAX){ + struct diff_filespec *two = rename_dst[b-dst].two; + struct diff_filespec *one = rename_src[b-src].p-one; + b-name_score = estimate_filename_similarity(one, two); + } + return b-name_score - a-name_score; + } return b-score - a-score; } @@ -282,7 +322,7 @@ static int find_identical_files(struct file_similarity *src, score =
[PATCH v2] diffcore-rename.c: Estimate filename similarity for rename detection
On rename detection like command git diff -M ..., rename is detected based on file similarities. This file similarities are calculated based on the contents of file. And, if the similarities of contents are the same, filename is taken into account. But, the similarity of filename is calculated just whether the basename is the same or not, and always returns just one or zero. So, for example, if there are multiple same files in the diff-ing commits, the result of rename detection is almost random, without taking into account the similarity of filename. Calculate filename similarities, and use the result to compare file similarity in case contents similarities are the same. Use Sorensen-Dice coefficient of bigrams in strings to calculate filename similarities because it take into account all part of the filenames, and time complexity is O(N), assuming N is the length of filenames. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diffcore-rename.c | 81 +++ 1 file changed, 64 insertions(+), 17 deletions(-) diff --git a/diffcore-rename.c b/diffcore-rename.c index 6c7a72f..355ea6d 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -96,26 +96,51 @@ static struct diff_rename_src *register_rename_src(struct diff_filepair *p) return (rename_src[first]); } -static int basename_same(struct diff_filespec *src, struct diff_filespec *dst) +static int estimate_filename_similarity(struct diff_filespec *src, struct diff_filespec *dst) { - int src_len = strlen(src-path), dst_len = strlen(dst-path); - while (src_len dst_len) { - char c1 = src-path[--src_len]; - char c2 = dst-path[--dst_len]; - if (c1 != c2) - return 0; - if (c1 == '/') - return 1; + /* +* Calculate similarity between src path and dst path using +* Sorensen-Dice coefficient of bigrams in strings +*/ + const char *src_path = src-path; + const char *dst_path = dst-path; + size_t src_len = strlen(src_path); + size_t dst_len = strlen(dst_path); + static uint16_t src_bigram[256][256]; + int i; + int bigrams_number = 0; + int similarity; + + for (i=0; isrc_len; i++) { + int c1 = ((unsigned char*)src_path)[i]; + int c2 = ((unsigned char*)src_path)[i + 1]; + src_bigram[c1][c2] ++; + } + for (i=0; idst_len; i++) { + int c1 = ((unsigned char*)dst_path)[i]; + int c2 = ((unsigned char*)dst_path)[i + 1]; + if (src_bigram[c1][c2] 0) { + src_bigram[c1][c2] --; + bigrams_number ++; + } + } + similarity = MAX_SCORE * 2 * bigrams_number / (src_len + dst_len); + +/* Clean up src_bigram */ + for (i=0; isrc_len; i++) { + int c1 = ((unsigned char*)src_path)[i]; + int c2 = ((unsigned char*)src_path)[i + 1]; + src_bigram[c1][c2] = 0; } - return (!src_len || src-path[src_len - 1] == '/') - (!dst_len || dst-path[dst_len - 1] == '/'); + + return similarity; } struct diff_score { int src; /* index in rename_src */ int dst; /* index in rename_dst */ unsigned short score; - short name_score; + unsigned short name_score; }; static int estimate_similarity(struct diff_filespec *src, @@ -228,7 +253,7 @@ static void record_rename_pair(int dst_index, int src_index, int score) */ static int score_compare(const void *a_, const void *b_) { - const struct diff_score *a = a_, *b = b_; + struct diff_score *a = (struct diff_score *)a_, *b = (struct diff_score *)b_; /* sink the unused ones to the bottom */ if (a-dst 0) @@ -236,8 +261,23 @@ static int score_compare(const void *a_, const void *b_) else if (b-dst 0) return -1; - if (a-score == b-score) + if (a-score == b-score){ + if(a-score == 0) + return 0; + /* Calculate name_score only when both score is the same */ + if(a-name_score == USHRT_MAX){ + struct diff_filespec *two = rename_dst[a-dst].two; + struct diff_filespec *one = rename_src[a-src].p-one; + a-name_score = estimate_filename_similarity(one, two); + } + if(b-name_score == USHRT_MAX){ + struct diff_filespec *two = rename_dst[b-dst].two; + struct diff_filespec *one = rename_src[b-src].p-one; + b-name_score = estimate_filename_similarity(one, two); + } + return b-name_score - a-name_score; + } return b-score - a-score; } @@ -282,7 +322,7 @@ static int
[PATCH v7] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. Before this commit, this output is shortened always by omitting left most part like ...foo = barbarbar. So, if the destination filename is too long, source filename putting left or arrow can be totally omitted like ...barbarbar, without including any of foofoofoo =. In such a case where arrow symbol is omitted, there is no way to know whether the file is renamed or existed in the original. Make sure there is always an arrow, like ...foo = ...bar. The output can contain curly braces('{','}') for grouping. So, in general, the output format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b, and sfx trying to have the same maximum length, but as long as filename part of sfx is kept. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com Test-added-by: Thomas Rast tr...@inf.ethz.ch --- diff.c | 184 +++-- t/t4001-diff-rename.sh | 12 2 files changed, 174 insertions(+), 22 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..cdf59c0 100644 --- a/diff.c +++ b/diff.c @@ -1258,11 +1258,13 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } } -static char *pprint_rename(const char *a, const char *b) +static void find_common_prefix_suffix(const char *a, const char *b, + struct strbuf *pfx, + struct strbuf *a_mid, struct strbuf *b_mid, + struct strbuf *sfx) { const char *old = a; const char *new = b; - struct strbuf name = STRBUF_INIT; int pfx_length, sfx_length; int pfx_adjust_for_slash; int len_a = strlen(a); @@ -1272,10 +1274,9 @@ static char *pprint_rename(const char *a, const char *b) int qlen_b = quote_c_style(b, NULL, NULL, 0); if (qlen_a || qlen_b) { - quote_c_style(a, name, NULL, 0); - strbuf_addstr(name, = ); - quote_c_style(b, name, NULL, 0); - return strbuf_detach(name, NULL); + quote_c_style(a, a_mid, NULL, 0); + quote_c_style(b, b_mid, NULL, 0); + return; } /* Find common prefix */ @@ -1322,17 +1323,146 @@ static char *pprint_rename(const char *a, const char *b) if (b_midlen 0) b_midlen = 0; - strbuf_grow(name, pfx_length + a_midlen + b_midlen + sfx_length + 7); - if (pfx_length + sfx_length) { - strbuf_add(name, a, pfx_length); + strbuf_add(pfx, a, pfx_length); + strbuf_add(a_mid, a + pfx_length, a_midlen); + strbuf_add(b_mid, b + pfx_length, b_midlen); + strbuf_add(sfx, a + len_a - sfx_length, sfx_length); +} + +/* + * Omit each parts to fix in name_width. + * Formatted string is pfx{a_mid = b_mid}sfx. + * At first, omit pfx as long as possible. + * If it is not enough, omit a_mid, b_mid, sfx by tring to set the length of + * those 3 parts(including ...) to the same, but keeping filename part of sfx. + * Ex: + * foofoofoo = barbarbar + * will be like + * ...foo = ...bar. + * long_parent{foofoofoo = barbarbar}path/filename + * will be like + * ...parent{...foofoo = ...barbar}.../filename + */ +static void rename_omit(struct strbuf *pfx, + struct strbuf *a_mid, struct strbuf *b_mid, + struct strbuf *sfx, + int name_width) +{ + static const char arrow[] = = ; + static const char dots[] = ...; + int use_curly_braces = (pfx-len 0) || (sfx-len 0); + size_t name_len; + size_t max_part_len = 0; + size_t remainder_part_len = 0; + size_t left, right; + size_t sfx_minlen; + char *sfx_last_slash; + size_t max_sfx_len; + + name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow) + + (use_curly_braces ? 2 : 0); + + if (name_len = name_width) + return; /* everything fits in name_width */ + + if (use_curly_braces) { + if (strlen(dots) + (name_len - pfx-len) = name_width) { + /* +* Just omitting left of '{' is enough +* Ex: ...aaa{foofoofoo = bar}file +*/ + strbuf_splice(pfx, 0, name_len - name_width + strlen(dots), dots, strlen(dots)); + return; + } else { + if (pfx-len strlen(dots)) { + /* +* Just omitting left of '{' is not enough +* name will be ...{SOMETHING}SOMETHING +*/ +
Re: [PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
Hello Junio Thank you very much for the reviewing. I try to fix the issues, and posted the updated patch as [PATCH v7]. I am not sure if distributing the burden of truncation equally to three parts so that the resulting pieces are of similar lengths is really a good idea. Between these two {...SourceDirectory = ...nationDirectory}...ileThatWasMoved {...ceDirectory = ...ionDirectory}nameOfTheFileThatWasMoved that attempt to show that the file nameOfTheFileThatWasMoved was moved from the longSourceDirectory to the DestinationDirectory, the latter is much more informative, I would think. In the [PATCH v7], I changed to keep filename part of suffix to handle above case, but not always keep directory part because I feel totally keeping all part of long suffix including directory name may cause output like: …{… = …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved And, above may be worse than: ...{...ceDirectory = …ionDirectory}.../nameOfTheFileThatWasMoved I think. Thank you ! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 17, 2013, at 10:29 PM, Junio C Hamano gits...@pobox.com wrote: Yoshioka Tsuneo yoshiokatsu...@gmail.com writes: git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. Before this commit, this output is shortened always by omitting left most part like ...foo = barbarbar. So, if the destination filename is too long, source filename putting left or arrow can be totally omitted like ...barbarbar, without including any of foofoofoo =. In such a case where arrow symbol is omitted, there is no way to know whether the file is renamed or existed in the original. Make sure there is always an arrow, like ...foo = ...bar. The output can contain curly braces('{','}') for grouping. So, in general, the output format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b, and sfx trying to have the maximum length the same because those will be equally important. I somehow find this solid wall of text extremely hard to read. Adding a blank line as a paragraph break may make it easier to read, perhaps. Also it is customary in our history to omit the full-stop from the patch title on the Subject: line. +name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow) ++ (use_curly_braces ? 2 : 0); + +if (name_len = name_width) { +/* Everthing fits in name_width */ +return; +} Logic up to this point seems good; drop {} around a single statement return;, i.e. if (name_len = name_width) return; /* everything fits */ +} else { +if (pfx-len strlen(dots)) { +/* + * Just omitting left of '{' is not enough + * name will be ...{SOMETHING}SOMETHING + */ +strbuf_reset(pfx); +strbuf_addstr(pfx, dots); +} (mental note) ... otherwise, i.e. with a short common prefix, the final result will be ab{SOMETHING}SOMETHING, which is also fine for the purpose of the remainder of this function. +} +} + +/* available length for a_mid, b_mid and sfx */ +len = name_width - strlen(arrow) - (use_curly_braces ? 2 : 0); + +/* a_mid, b_mid, sfx will be have the same max, including ellipsis(...). */ +part_length[0] = a_mid-len; +part_length[1] = b_mid-len; +part_length[2] = sfx-len; + +qsort(part_length, sizeof(part_length)/sizeof(part_length[0]), sizeof(part_length[0]) + , compare_size_t_descending_order); In our code, comma does not come at the beginning of continued line. +if (part_length[1] + part_length[1] + part_length[2] = len) { +/* + * {...foofoo = barbar}file + * There is only one omitted part. + */ +max_part_len = len - part_length[1] - part_length[2]; It would be clearer to explicitly set remainder to zero here, and omit the initialization of the variable. That would make what the three parts of if/elseif/else do more consistent. +} else if (part_length[2] + part_length[2] + part_length[2] = len) { +/* + * {...foofoo = ...barbar}file + * There are 2 omitted parts. + */ +max_part_len = (len - part_length[2]) / 2; +remainder_part_len = (len - part_length[2]) - max_part_len * 2; +} else { +/* + * {...ofoo = ...rbar}...file + * There are 3 omitted parts. + */ +max_part_len = len / 3; +remainder_part_len = len - (max_part_len) * 3
Re: [PATCH v5] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
Hello Junio, Keshav Thank you very much for your detailed reviewing. I just tried to update the patch and posted as [PATCH v6]. --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 16, 2013, at 1:54 AM, Junio C Hamano gits...@pobox.com wrote: Yoshioka Tsuneo yoshiokatsu...@gmail.com writes: git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. But if destination filename is long, the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. Is destination filename more special than the source filename? Perhaps s/if destination filename is/if filenames are/? Note: I do not want you to reroll using the suggested wording without explanation; it may be possible that I am missing something obvious and do not understand why you singled out destination, in which case I'd rather see it explained better in the log message than the potentially suboptimal suggestion I made in the review without understanding the issue. Of course, it is possible that you want to do the same when source is overlong, in which case you can just say Yeah, you're right; will reroll. The above applies to all the other comments in this message. Also s/source commit/original/. You may not be comparing two commits after all. Make sure there is always an arrow, like ...foo = ...bar. The output can contains curly braces('{','}') for grouping. s/contains/contain/; So, in general, the outpu format is pfx{mid_a = mid_b}sfx s/outpu/t/; To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b, and sfx trying to have the maximum length the same because those will be equaly important. A sound reasoning. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com Test-added-by: Thomas Rast tr...@inf.ethz.ch --- diff.c | 187 +++-- t/t4001-diff-rename.sh | 12 2 files changed, 177 insertions(+), 22 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..cf50807 100644 --- a/diff.c +++ b/diff.c @@ -1258,11 +1258,12 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } } -static char *pprint_rename(const char *a, const char *b) +static void pprint_rename_find_common_prefix_suffix(const char *a, const char *b + , struct strbuf *pfx, struct strbuf *a_mid + , struct strbuf *b_mid, struct strbuf *sfx) What kind of line splitting is this? I think the real issue is that the function name is overly long, but aside from that, - comma comes at the end of the line, not at the beginning of the next line; - the second and subsequent lines are indented, but not more than the usual line width (align with the first letter inside the opening parenthesis of the first line); - a_mid and b_mid are more alike than pfx and a_mid. so I would expect to see it more like: static void abbrev_rename(const char *a, const char *b, struct strbuf *pfx, struct strbuf *a_mid, struct strbuf *b_mid, struct strbuf *sfx) Note that the suggested name does not say pprint, because in your version of this file, the code around here is no longer doing any printing. The caller does so after using this function to decide how to abbreviate renames, so naming the helper function after what it does (e.g. abbreviate renames) is more appropriate. { const char *old = a; const char *new = b; -struct strbuf name = STRBUF_INIT; int pfx_length, sfx_length; int pfx_adjust_for_slash; int len_a = strlen(a); @@ -1272,10 +1273,9 @@ static char *pprint_rename(const char *a, const char *b) int qlen_b = quote_c_style(b, NULL, NULL, 0); if (qlen_a || qlen_b) { -quote_c_style(a, name, NULL, 0); -strbuf_addstr(name, = ); -quote_c_style(b, name, NULL, 0); -return strbuf_detach(name, NULL); +quote_c_style(a, a_mid, NULL, 0); +quote_c_style(b, b_mid, NULL, 0); +return; } /* Find common prefix */ @@ -1321,18 +1321,151 @@ static char *pprint_rename(const char *a, const char *b) a_midlen = 0; if (b_midlen 0) b_midlen = 0; + Trailing whitespace (there are many others you added to this file; I won't bother to point out all of them). +strbuf_add(pfx, a, pfx_length); +strbuf_add(a_mid, a + pfx_length, a_midlen); +strbuf_add(b_mid, b
[PATCH v6] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. Before this commit, this output is shortened always by omitting left most part like ...foo = barbarbar. So, if the destination filename is too long, source filename putting left or arrow can be totally omitted like ...barbarbar, without including any of foofoofoo =. In such a case where arrow symbol is omitted, there is no way to know whether the file is renamed or existed in the original. Make sure there is always an arrow, like ...foo = ...bar. The output can contain curly braces('{','}') for grouping. So, in general, the output format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b, and sfx trying to have the maximum length the same because those will be equally important. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com Test-added-by: Thomas Rast tr...@inf.ethz.ch --- diff.c | 180 +++-- t/t4001-diff-rename.sh | 12 2 files changed, 170 insertions(+), 22 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..afe6a36 100644 --- a/diff.c +++ b/diff.c @@ -1258,11 +1258,13 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } } -static char *pprint_rename(const char *a, const char *b) +static void find_common_prefix_suffix(const char *a, const char *b, + struct strbuf *pfx, + struct strbuf *a_mid, struct strbuf *b_mid, + struct strbuf *sfx) { const char *old = a; const char *new = b; - struct strbuf name = STRBUF_INIT; int pfx_length, sfx_length; int pfx_adjust_for_slash; int len_a = strlen(a); @@ -1272,10 +1274,9 @@ static char *pprint_rename(const char *a, const char *b) int qlen_b = quote_c_style(b, NULL, NULL, 0); if (qlen_a || qlen_b) { - quote_c_style(a, name, NULL, 0); - strbuf_addstr(name, = ); - quote_c_style(b, name, NULL, 0); - return strbuf_detach(name, NULL); + quote_c_style(a, a_mid, NULL, 0); + quote_c_style(b, b_mid, NULL, 0); + return; } /* Find common prefix */ @@ -1322,17 +1323,142 @@ static char *pprint_rename(const char *a, const char *b) if (b_midlen 0) b_midlen = 0; - strbuf_grow(name, pfx_length + a_midlen + b_midlen + sfx_length + 7); - if (pfx_length + sfx_length) { - strbuf_add(name, a, pfx_length); + strbuf_add(pfx, a, pfx_length); + strbuf_add(a_mid, a + pfx_length, a_midlen); + strbuf_add(b_mid, b + pfx_length, b_midlen); + strbuf_add(sfx, a + len_a - sfx_length, sfx_length); +} + +static int compare_size_t_descending_order(const void *left, const void *right) +{ + size_t left_val = *(size_t *)left; + size_t right_val = *(size_t *)right; + return (int)(right_val - left_val); +} + +/* + * Omit each parts to fix in name_width. + * Formatted string is pfx{a_mid = b_mid}sfx. + * At first, omit pfx as long as possible. + * If it is not enough, omit a_mid, b_mid, sfx by tring to set the length of + * those 3 parts(including ...) to the same. + * Ex: + * foofoofoo = barbarbar + * will be like + * ...foo = ...bar. + * long_parent{foofoofoo = barbarbar}longfilename + * will be like + * ...parent{...foofoo = ...barbar}...lename + */ +static void rename_omit(struct strbuf *pfx, + struct strbuf *a_mid, struct strbuf *b_mid, + struct strbuf *sfx, + int name_width) +{ + static const char arrow[] = = ; + static const char dots[] = ...; + int use_curly_braces = (pfx-len 0) || (sfx-len 0); + size_t name_len; + size_t len; + size_t part_length[3]; + size_t max_part_len = 0; + size_t remainder_part_len = 0; + + name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(arrow) + + (use_curly_braces ? 2 : 0); + + if (name_len = name_width) { + /* Everthing fits in name_width */ + return; + } + + if (use_curly_braces) { + if (strlen(dots) + (name_len - pfx-len) = name_width) { + /* +* Just omitting left of '{' is enough +* Ex: ...aaa{foofoofoo = bar}file +*/ + strbuf_splice(pfx, 0, name_len - name_width + strlen(dots), dots, strlen(dots)); + return; + } else { + if (pfx-len strlen(dots)) { + /* +* Just omitting left of '{' is not
Re: [PATCH v3] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
Hello Duy Thank you very much your suggestion. As you suggested, I tried to reuse intermediate result of pprint_rename(), instead of parsing the output again. I just posted the new patch as PATCH v4 Thanks ! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 14, 2013, at 10:04 PM, Duy Nguyen pclo...@gmail.com wrote: On Sun, Oct 13, 2013 at 3:48 AM, Yoshioka Tsuneo yoshiokatsu...@gmail.com wrote: git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar, but if destination filename is long the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. This commit makes it visible like ...foo = ...bar. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diff.c | 58 +++--- 1 file changed, 51 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..3aeaf3e 100644 --- a/diff.c +++ b/diff.c @@ -1643,13 +1643,57 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) len = name_width; name_len = strlen(name); if (name_width name_len) { - char *slash; - prefix = ...; - len -= 3; - name += name_len - len; - slash = strchr(name, '/'); - if (slash) - name = slash; + char *arrow = strstr(name, = ); + if (arrow) { This looks iffy. What if = is part of the path name? file-is_renamed would be a more reliable sign. In that case I think you just need an ellipsis version of pprint_rename() (i.e. drop the result of previous pprint_rename() on the floor and create a new string with ... and = in your pprint_ellipsis_rename or something) + int prefix_len = (name_width - 4) / 2; + int f_omit; + int f_brace = 0; + char *pre_arrow = alloca(name_width + 10); + char *post_arrow = arrow + 4; + char *prefix_buf = alloca(name_width + 10); + char *pre_arrow_slash = NULL; + + if (arrow - name prefix_len) { + prefix_len = (int)(arrow - name); + f_omit = 0; + } else { + prefix_len -= 3; + f_omit = 1; + if (name[0] == '{') { + prefix_len -= 1; + f_brace = 1; + } + } + prefix_len = ((prefix_len = 0) ? prefix_len : 0); + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '\0'; + pre_arrow_slash = strchr(pre_arrow, '/'); + if (f_omit pre_arrow_slash) + pre_arrow = pre_arrow_slash; + sprintf(prefix_buf, %s%s%s = , (f_brace ? { : ), (f_omit ? ... : ), pre_arrow); + prefix = prefix_buf; + + if (strlen(post_arrow) name_width - strlen(prefix)) { + char *post_arrow_slash = NULL; + + post_arrow += strlen(post_arrow) - (name_width - strlen(prefix) - 3); + strcat(prefix_buf, ...); + post_arrow_slash = strchr(post_arrow, '/'); + if (post_arrow_slash) + post_arrow = post_arrow_slash; + name = post_arrow; + name_len = (int) (name_width - strlen(prefix)); + } + len -= strlen(prefix); + } else { + char *slash = NULL; + prefix = ...; + len -= 3; + name += name_len - len; + slash = strchr(name, '/'); + if (slash) + name = slash; + } } if (file-is_binary) { -- 1.8.4.475.g867697c -- To unsubscribe from this list
[PATCH v5] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. But if destination filename is long, the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. Make sure there is always an arrow, like ...foo = ...bar. The output can contains curly braces('{','}') for grouping. So, in general, the outpu format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b, and sfx trying to have the maximum length the same because those will be equaly important. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com Test-added-by: Thomas Rast tr...@inf.ethz.ch --- diff.c | 187 +++-- t/t4001-diff-rename.sh | 12 2 files changed, 177 insertions(+), 22 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..cf50807 100644 --- a/diff.c +++ b/diff.c @@ -1258,11 +1258,12 @@ static void fn_out_consume(void *priv, char *line, unsigned long len) } } -static char *pprint_rename(const char *a, const char *b) +static void pprint_rename_find_common_prefix_suffix(const char *a, const char *b + , struct strbuf *pfx, struct strbuf *a_mid + , struct strbuf *b_mid, struct strbuf *sfx) { const char *old = a; const char *new = b; - struct strbuf name = STRBUF_INIT; int pfx_length, sfx_length; int pfx_adjust_for_slash; int len_a = strlen(a); @@ -1272,10 +1273,9 @@ static char *pprint_rename(const char *a, const char *b) int qlen_b = quote_c_style(b, NULL, NULL, 0); if (qlen_a || qlen_b) { - quote_c_style(a, name, NULL, 0); - strbuf_addstr(name, = ); - quote_c_style(b, name, NULL, 0); - return strbuf_detach(name, NULL); + quote_c_style(a, a_mid, NULL, 0); + quote_c_style(b, b_mid, NULL, 0); + return; } /* Find common prefix */ @@ -1321,18 +1321,151 @@ static char *pprint_rename(const char *a, const char *b) a_midlen = 0; if (b_midlen 0) b_midlen = 0; + + strbuf_add(pfx, a, pfx_length); + strbuf_add(a_mid, a + pfx_length, a_midlen); + strbuf_add(b_mid, b + pfx_length, b_midlen); + strbuf_add(sfx, a + len_a - sfx_length, sfx_length); +} + +/* + * Omit each parts to fix in name_width. + * Formatted string is pfx{a_mid = b_mid}sfx. + * At first, omit pfx as long as possible. + * If it is not enough, omit a_mid, b_mid, sfx by tring to set the length of + * those 3 parts(including ...) to the same. + * Ex: + * foofoofoo = barbarbar + * will be like + * ...foo = ...bar. + * long_parent{foofoofoo = barbarbar}longfilename + * will be like + * ...parent{...foofoo = ...barbar}...lename + */ +static void pprint_rename_omit(struct strbuf *pfx, struct strbuf *a_mid, struct strbuf *b_mid + , struct strbuf *sfx, int name_width) +{ + +#define ARROW = +#define ELLIPSIS ... +#define swap(a, b) myswap((a), (b), sizeof(a)) + +#define myswap(a, b, size) do {\ +unsigned char mytmp[size]; \ +memcpy(mytmp, a, size); \ +memcpy(a, b, size); \ +memcpy(b, mytmp, size); \ +} while (0) + + int use_curly_braces = (pfx-len 0) || (sfx-len 0); + size_t name_len; + size_t len; + size_t part_lengths[4]; + size_t max_part_len = 0; + size_t remainder_part_len = 0; + int i, j; + + name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(ARROW) + + (use_curly_braces ? 2 : 0); + + if (name_len = name_width) { + /* Everthing fits in name_width */ + return; + } + + if (use_curly_braces) { + if (strlen(ELLIPSIS) + (name_len - pfx-len) = name_width) { + /* +Just omitting left of '{' is enough +Ex: ...aaa{foofoofoo = bar}file +*/ + strbuf_splice(pfx, name_len - pfx-len, name_width - (name_len - pfx-len), ELLIPSIS, strlen(ELLIPSIS)); + return; + } else { + if (pfx-len strlen(ELLIPSIS)) { + /* +Just omitting left of '{' is not enough +name will be ...{SOMETHING}SOMETHING +*/ + strbuf_reset(pfx); +
Re: [PATCH v4] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
Hello Felipe Thank you for pointing out the style issue again. I just fixed it and posted as [PATCH v5]. Thanks! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 15, 2013, at 1:07 PM, Felipe Contreras felipe.contre...@gmail.com wrote: On Tue, Oct 15, 2013 at 4:45 AM, Yoshioka Tsuneo yoshiokatsu...@gmail.com wrote: git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar. But if destination filename is long, the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. Make sure there is always an arrow, like ...foo = ...bar. The output can contains curly braces('{','}') for grouping. So, in general, the outpu format is pfx{mid_a = mid_b}sfx To keep arrow(=), try to omit pfx as long as possible at first because later part or changing part will be the more important part. If it is not enough, shorten mid_a, mid_b, and sfx trying to have the maximum length the same because those will be equaly important. This has similar style issues as v1. -static char *pprint_rename(const char *a, const char *b) +static void pprint_rename_find_common_prefix_suffix(const char *a, const char *b, struct strbuf *pfx, struct strbuf *a_mid, struct strbuf *b_mid, struct strbuf *sfx) { const char *old = a; const char *new = b; - struct strbuf name = STRBUF_INIT; int pfx_length, sfx_length; int pfx_adjust_for_slash; int len_a = strlen(a); @@ -1272,10 +1271,9 @@ static char *pprint_rename(const char *a, const char *b) int qlen_b = quote_c_style(b, NULL, NULL, 0); if (qlen_a || qlen_b) { - quote_c_style(a, name, NULL, 0); - strbuf_addstr(name, = ); - quote_c_style(b, name, NULL, 0); - return strbuf_detach(name, NULL); + quote_c_style(a, a_mid, NULL, 0); + quote_c_style(b, b_mid, NULL, 0); + return; } /* Find common prefix */ @@ -1321,18 +1319,149 @@ static char *pprint_rename(const char *a, const char *b) a_midlen = 0; if (b_midlen 0) b_midlen = 0; + + strbuf_add(pfx, a, pfx_length); + strbuf_add(a_mid, a + pfx_length, a_midlen); + strbuf_add(b_mid, b + pfx_length, b_midlen); + strbuf_add(sfx, a + len_a - sfx_length, sfx_length); +} - strbuf_grow(name, pfx_length + a_midlen + b_midlen + sfx_length + 7); - if (pfx_length + sfx_length) { - strbuf_add(name, a, pfx_length); +/* + * Omit each parts to fix in name_width. + * Formatted string is pfx{a_mid = b_mid}sfx. + * At first, omit pfx as long as possible. + * If it is not enough, omit a_mid, b_mid, sfx by tring to set the length of + * those 3 parts(including ...) to the same. + * Ex: + * foofoofoo = barbarbar + * will be like + * ...foo = ...bar. + * long_parent{foofoofoo = barbarbar}longfilename + * will be like + * ...parent{...foofoo = ...barbar}...lename + */ +static void pprint_rename_omit(struct strbuf *pfx, struct strbuf *a_mid, struct strbuf *b_mid, struct strbuf *sfx, int name_width) Seems like this line needs to be broken. +{ + +#define ARROW = +#define ELLIPSIS ... +#define swap(a,b) myswap((a),(b),sizeof(a)) I'm not entirely sure, but I think this should be: #define swap(a, b) myswap((a), (b), sizeof(a)) + +#define myswap(a, b, size) do {\ +unsigned char mytmp[size]; \ +memcpy(mytmp, a, size); \ +memcpy(a, b, size); \ +memcpy(b, mytmp, size); \ +} while (0) + + int use_curly_braces = (pfx-len 0) || (sfx-len 0); + size_t name_len; + size_t len; + size_t part_lengths[4]; + size_t max_part_len = 0; + size_t remainder_part_len = 0; + int i, j; + + name_len = pfx-len + a_mid-len + b_mid-len + sfx-len + strlen(ARROW) + (use_curly_braces?2:0); + + if (name_len = name_width){ if () { + /* Everthing fits in name_width */ + return; + } + + if(use_curly_braces){ Ditto. + if(strlen(ELLIPSIS) + (name_len - pfx-len) = name_width){ Ditto. + /* +Just omitting left of '{' is enough +Ex: ...aaa{foofoofoo = bar}file +*/ + strbuf_splice(pfx, name_len - pfx-len, name_width - (name_len - pfx-len), ELLIPSIS, strlen(ELLIPSIS)); + return; + }else{ } else { + if (pfx-len strlen(ELLIPSIS)) { + /* +Just omitting left of '{' is not enough +name will be ...{SOMETHING}SOMETHING
[PATCH v3] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar, but if destination filename is long the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. This commit makes it visible like ...foo = ...bar. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diff.c | 58 +++--- 1 file changed, 51 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..3aeaf3e 100644 --- a/diff.c +++ b/diff.c @@ -1643,13 +1643,57 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) len = name_width; name_len = strlen(name); if (name_width name_len) { - char *slash; - prefix = ...; - len -= 3; - name += name_len - len; - slash = strchr(name, '/'); - if (slash) - name = slash; + char *arrow = strstr(name, = ); + if (arrow) { + int prefix_len = (name_width - 4) / 2; + int f_omit; + int f_brace = 0; + char *pre_arrow = alloca(name_width + 10); + char *post_arrow = arrow + 4; + char *prefix_buf = alloca(name_width + 10); + char *pre_arrow_slash = NULL; + + if (arrow - name prefix_len) { + prefix_len = (int)(arrow - name); + f_omit = 0; + } else { + prefix_len -= 3; + f_omit = 1; + if (name[0] == '{') { + prefix_len -= 1; + f_brace = 1; + } + } + prefix_len = ((prefix_len = 0) ? prefix_len : 0); + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '\0'; + pre_arrow_slash = strchr(pre_arrow, '/'); + if (f_omit pre_arrow_slash) + pre_arrow = pre_arrow_slash; + sprintf(prefix_buf, %s%s%s = , (f_brace ? { : ), (f_omit ? ... : ), pre_arrow); + prefix = prefix_buf; + + if (strlen(post_arrow) name_width - strlen(prefix)) { + char *post_arrow_slash = NULL; + + post_arrow += strlen(post_arrow) - (name_width - strlen(prefix) - 3); + strcat(prefix_buf, ...); + post_arrow_slash = strchr(post_arrow, '/'); + if (post_arrow_slash) + post_arrow = post_arrow_slash; + name = post_arrow; + name_len = (int) (name_width - strlen(prefix)); + } + len -= strlen(prefix); + } else { + char *slash = NULL; + prefix = ...; + len -= 3; + name += name_len - len; + slash = strchr(name, '/'); + if (slash) + name = slash; + } } if (file-is_binary) { -- 1.8.4.475.g867697c -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v3] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar, but if destination filename is long the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. This commit makes it visible like ...foo = ...bar. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diff.c | 58 +++--- 1 file changed, 51 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..3aeaf3e 100644 --- a/diff.c +++ b/diff.c @@ -1643,13 +1643,57 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) len = name_width; name_len = strlen(name); if (name_width name_len) { - char *slash; - prefix = ...; - len -= 3; - name += name_len - len; - slash = strchr(name, '/'); - if (slash) - name = slash; + char *arrow = strstr(name, = ); + if (arrow) { + int prefix_len = (name_width - 4) / 2; + int f_omit; + int f_brace = 0; + char *pre_arrow = alloca(name_width + 10); + char *post_arrow = arrow + 4; + char *prefix_buf = alloca(name_width + 10); + char *pre_arrow_slash = NULL; + + if (arrow - name prefix_len) { + prefix_len = (int)(arrow - name); + f_omit = 0; + } else { + prefix_len -= 3; + f_omit = 1; + if (name[0] == '{') { + prefix_len -= 1; + f_brace = 1; + } + } + prefix_len = ((prefix_len = 0) ? prefix_len : 0); + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '\0'; + pre_arrow_slash = strchr(pre_arrow, '/'); + if (f_omit pre_arrow_slash) + pre_arrow = pre_arrow_slash; + sprintf(prefix_buf, %s%s%s = , (f_brace ? { : ), (f_omit ? ... : ), pre_arrow); + prefix = prefix_buf; + + if (strlen(post_arrow) name_width - strlen(prefix)) { + char *post_arrow_slash = NULL; + + post_arrow += strlen(post_arrow) - (name_width - strlen(prefix) - 3); + strcat(prefix_buf, ...); + post_arrow_slash = strchr(post_arrow, '/'); + if (post_arrow_slash) + post_arrow = post_arrow_slash; + name = post_arrow; + name_len = (int) (name_width - strlen(prefix)); + } + len -= strlen(prefix); + } else { + char *slash = NULL; + prefix = ...; + len -= 3; + name += name_len - len; + slash = strchr(name, '/'); + if (slash) + name = slash; + } } if (file-is_binary) { -- 1.8.4.475.g867697c -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [spf:guess,mismatch] [PATCH v2] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
Hello On Oct 12, 2013, at 8:35 AM, Keshav Kini keshav.k...@gmail.com wrote: Sam Vilain s...@vilain.net writes: On 10/11/2013 06:07 AM, Yoshioka Tsuneo wrote: + prefix_len = ((prefix_len = 0) ? prefix_len : 0); + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '¥0'; This seems to be an encoding mistake; was this supposed to be an ASCII arrow? That's supposed to be a null terminator character, '\0'. See https://en.wikipedia.org/wiki/Yen_symbol#Code_page_932 Thank you for pointing out the encoding issue. It looks, I need to change encoding of pbcopy command to en_US.UTF-8 from C on Mac OS X. I just sent updated patch as PATCH v3. Thanks ! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 12, 2013, at 8:35 AM, Keshav Kini keshav.k...@gmail.com wrote: Sam Vilain s...@vilain.net writes: On 10/11/2013 06:07 AM, Yoshioka Tsuneo wrote: + prefix_len = ((prefix_len = 0) ? prefix_len : 0); + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '¥0'; This seems to be an encoding mistake; was this supposed to be an ASCII arrow? That's supposed to be a null terminator character, '\0'. See https://en.wikipedia.org/wiki/Yen_symbol#Code_page_932 -Keshav -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar, but if destination filename is long the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. This commit makes it visible like ...foo = ...bar. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diff.c | 54 +++--- 1 file changed, 47 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..9b55546 100644 --- a/diff.c +++ b/diff.c @@ -1643,13 +1643,53 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) len = name_width; name_len = strlen(name); if (name_width name_len) { - char *slash; - prefix = ...; - len -= 3; - name += name_len - len; - slash = strchr(name, '/'); - if (slash) - name = slash; + char *arrow = strstr(name, = ); + if(arrow){ + int prefix_len = (name_width- 4) / 2; + int f_omit; + char *pre_arrow = alloca(name_width + 10); + char *post_arrow = arrow + 4; + char *prefix_buf = alloca(name_width + 10); + char *pre_arrow_slash = NULL; + + if(arrow - name prefix_len){ + prefix_len = (int)(arrow - name); + f_omit = 0; + }else{ + prefix_len -= 3; + f_omit = 1; + } + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '¥0'; + pre_arrow_slash = strchr(pre_arrow, '/'); + if(f_omit pre_arrow_slash){ + pre_arrow = pre_arrow_slash; + } + sprintf(prefix_buf, %s%s = , (f_omit ? ... : ), pre_arrow); + prefix = prefix_buf; + + if(strlen(post_arrow) name_width - strlen(prefix)){ + char *post_arrow_slash = NULL; + + post_arrow += strlen(post_arrow) - (name_width - strlen(prefix) - 3); + strcat(prefix_buf, ...); + post_arrow_slash = strchr(post_arrow, '/'); + if(post_arrow_slash){ + post_arrow = post_arrow_slash; + } + name = post_arrow; + name_len = (int) (name_width - strlen(prefix)); + } + len -= strlen(prefix); + }else{ + char *slash = NULL; + prefix = ...; + len -= 3; + name += name_len - len; + slash = strchr(name, '/'); + if (slash) + name = slash; + } } if (file-is_binary) { -- 1.8.4.475.g867697c -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH v2] diff.c: keep arrow(=) on show_stats()'s shortened filename part to make rename visible.
git diff -M --stat can detect rename and show renamed file name like foofoofoo = barbarbar, but if destination filename is long the line is shortened like ...barbarbar so there is no way to know whether the file is renamed or existed in the source commit. This commit makes it visible like ...foo = ...bar. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diff.c | 58 +++--- 1 file changed, 51 insertions(+), 7 deletions(-) diff --git a/diff.c b/diff.c index a04a34d..3aeaf3e 100644 --- a/diff.c +++ b/diff.c @@ -1643,13 +1643,57 @@ static void show_stats(struct diffstat_t *data, struct diff_options *options) len = name_width; name_len = strlen(name); if (name_width name_len) { - char *slash; - prefix = ...; - len -= 3; - name += name_len - len; - slash = strchr(name, '/'); - if (slash) - name = slash; + char *arrow = strstr(name, = ); + if (arrow) { + int prefix_len = (name_width - 4) / 2; + int f_omit; + int f_brace = 0; + char *pre_arrow = alloca(name_width + 10); + char *post_arrow = arrow + 4; + char *prefix_buf = alloca(name_width + 10); + char *pre_arrow_slash = NULL; + + if (arrow - name prefix_len) { + prefix_len = (int)(arrow - name); + f_omit = 0; + } else { + prefix_len -= 3; + f_omit = 1; + if (name[0] == '{') { + prefix_len -= 1; + f_brace = 1; + } + } + prefix_len = ((prefix_len = 0) ? prefix_len : 0); + strncpy(pre_arrow, arrow - prefix_len, prefix_len); + pre_arrow[prefix_len] = '¥0'; + pre_arrow_slash = strchr(pre_arrow, '/'); + if (f_omit pre_arrow_slash) + pre_arrow = pre_arrow_slash; + sprintf(prefix_buf, %s%s%s = , (f_brace ? { : ), (f_omit ? ... : ), pre_arrow); + prefix = prefix_buf; + + if (strlen(post_arrow) name_width - strlen(prefix)) { + char *post_arrow_slash = NULL; + + post_arrow += strlen(post_arrow) - (name_width - strlen(prefix) - 3); + strcat(prefix_buf, ...); + post_arrow_slash = strchr(post_arrow, '/'); + if (post_arrow_slash) + post_arrow = post_arrow_slash; + name = post_arrow; + name_len = (int) (name_width - strlen(prefix)); + } + len -= strlen(prefix); + } else { + char *slash = NULL; + prefix = ...; + len -= 3; + name += name_len - len; + slash = strchr(name, '/'); + if (slash) + name = slash; + } } if (file-is_binary) { -- 1.8.4.475.g867697c -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] diffcore-delta: optimize renames and copies detection (git diff -M/-C)
diffcore_count_changes() can return -1 when src_copied is greater than delta_limit, without counting all the src_copied. By that, performance of diff -M/-C can be improved. Signed-off-by: Tsuneo Yoshioka yoshiokatsu...@gmail.com --- diffcore-delta.c | 11 --- diffcore-rename.c | 2 +- 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/diffcore-delta.c b/diffcore-delta.c index 7cf431d..0a9290e 100644 --- a/diffcore-delta.c +++ b/diffcore-delta.c @@ -173,7 +173,7 @@ int diffcore_count_changes(struct diff_filespec *src, { struct spanhash *s, *d; struct spanhash_top *src_count, *dst_count; - unsigned long sc, la; + unsigned long sc, not_sc, la; src_count = dst_count = NULL; if (src_count_p) @@ -190,7 +190,7 @@ int diffcore_count_changes(struct diff_filespec *src, if (dst_count_p) *dst_count_p = dst_count; } - sc = la = 0; + sc = not_sc = la = 0; s = src_count-data; d = dst_count-data; @@ -214,8 +214,13 @@ int diffcore_count_changes(struct diff_filespec *src, la += dst_cnt - src_cnt; sc += src_cnt; } - else + else{ sc += dst_cnt; + not_sc += (src_cnt - dst_cnt); + if(delta_limit != 0 not_sc delta_limit){ + return -1; + } + } s++; } while (d-cnt) { diff --git a/diffcore-rename.c b/diffcore-rename.c index 6c7a72f..d52b2c8 100644 --- a/diffcore-rename.c +++ b/diffcore-rename.c @@ -181,7 +181,7 @@ static int estimate_similarity(struct diff_filespec *src, return 0; delta_limit = (unsigned long) - (base_size * (MAX_SCORE-minimum_score) / MAX_SCORE); + (max_size * (MAX_SCORE-minimum_score) / MAX_SCORE); if (diffcore_count_changes(src, dst, src-cnt_data, dst-cnt_data, delta_limit, -- 1.7.12.4 (Apple Git-37) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html