Re: [PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
On Tue, Jul 02, 2013 at 09:22:09AM +0200, Johannes Sixt wrote:
> On 7/2/2013 0:50, Alexey Shumkin wrote:
>> On Mon, Jul 01, 2013 at 09:00:55AM +0200, Johannes Sixt wrote:
>>> On 6/26/2013 12:19, Alexey Shumkin wrote:
>>>>  test_expect_success 'setup complex body' '
>>>>  	git config i18n.commitencoding iso8859-1 &&
>>>>  	echo change2 >foo && git commit -a -F commit-msg &&
>>>>  	head3=$(git rev-parse --verify HEAD) &&
>>>> -	head3_short=$(git rev-parse --short $head3)
>>>> +	head3_short=$(git rev-parse --short $head3) &&
>>>> +	# unset commit encoding config
>>>> +	# otherwise %e does not print encoding value
>>>> +	# and following test fails
>>>
>>> I don't understand this comment. The test vector below already shows
>>> that an encoding is printed. Why would this suddenly be different with
>>> the updated tests?
>>
>> I've changed tests. I've reverted back these ones, and added new ones
>> with no i18n.commitEncoding set
>>
>>> Assuming that this change doesn't sweep a deeper problem under the rug,
>>> it's better to use test_config a few lines earlier.
>>>
>>>> +	git config --unset i18n.commitEncoding
>>>> +
>>>>  '
>>>>
>>>>  test_format complex-encoding %e <<EOF
>>>>  commit $head3
>>>>  iso8859-1
>>>
>>> This is the encoding that I mean.
>>
>> These encodings have appeared because we've changed 'setup': we make
>> commits with i18n.commitEncoding set
>
> I understand why there are additional encoding entries in the expected
> output, but we see one encoding entry already listed without this patch.
> Why do you say "does not print encoding value" in the comment above?

I don't even remember today. I guess that was a legacy comment (it was
originally written a long time ago). Never mind, it has since been
removed ;)

>>>>  commit $head2
>>>> +iso-8859-1
>>>>  commit $head1
>>>> +iso-8859-1
>>>>  EOF
>
> -- Hannes
Re: [PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
On 7/2/2013 0:50, Alexey Shumkin wrote:
> On Mon, Jul 01, 2013 at 09:00:55AM +0200, Johannes Sixt wrote:
>> On 6/26/2013 12:19, Alexey Shumkin wrote:
>>>  test_expect_success 'setup complex body' '
>>>  	git config i18n.commitencoding iso8859-1 &&
>>>  	echo change2 >foo && git commit -a -F commit-msg &&
>>>  	head3=$(git rev-parse --verify HEAD) &&
>>> -	head3_short=$(git rev-parse --short $head3)
>>> +	head3_short=$(git rev-parse --short $head3) &&
>>> +	# unset commit encoding config
>>> +	# otherwise %e does not print encoding value
>>> +	# and following test fails
>>
>> I don't understand this comment. The test vector below already shows
>> that an encoding is printed. Why would this suddenly be different with
>> the updated tests?
>
> I've changed tests. I've reverted back these ones, and added new ones
> with no i18n.commitEncoding set
>
>> Assuming that this change doesn't sweep a deeper problem under the rug,
>> it's better to use test_config a few lines earlier.
>>
>>> +	git config --unset i18n.commitEncoding
>>> +
>>>  '
>>>
>>>  test_format complex-encoding %e <<EOF
>>>  commit $head3
>>>  iso8859-1
>>
>> This is the encoding that I mean.
>
> These encodings have appeared because we've changed 'setup': we make
> commits with i18n.commitEncoding set

I understand why there are additional encoding entries in the expected
output, but we see one encoding entry already listed without this patch.
Why do you say "does not print encoding value" in the comment above?

>>>  commit $head2
>>> +iso-8859-1
>>>  commit $head1
>>> +iso-8859-1
>>>  EOF

-- Hannes
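
A side note on the point about %e: the sketch below is one way to check, outside git's test library, that %e reports the encoding recorded in the commit itself and not whatever configuration is in effect when the log is read. It assumes git and iconv are installed; the function name, the temporary repository and the demo identity are made up for illustration. The encoding is passed with `git -c` for a single commit (the t4041 hunk of this patch does the same), so there is nothing to unset afterwards.

```shell
# Hypothetical helper (not part of git's test suite): commit once with a
# one-shot encoding setting, then print what %e reports for that commit.
encoding_of_scoped_commit () {
	repo=$(mktemp -d) &&
	(
		cd "$repo" &&
		git init -q &&
		# "hinzugefuegt" with u-umlaut, written in UTF-8, converted to ISO-8859-1 bytes
		msg=$(printf 'hinzugef\303\274gt' | iconv -f utf-8 -t iso8859-1) &&
		echo change >foo &&
		git add foo &&
		git -c user.name=demo -c user.email=demo@example.com \
			-c i18n.commitEncoding=iso8859-1 commit -q -m "$msg" &&
		# the setting was never written to the repository config
		! git config --local --get i18n.commitEncoding &&
		# %e reads the encoding header stored in the commit object
		git log -1 --format=%e
	)
	status=$?
	rm -rf "$repo"
	return $status
}

encoding_of_scoped_commit
```

Inside a test script, test_config (as suggested above) would be the idiomatic way to get the same "no leftover setting" effect.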
Re: [PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
On 6/26/2013 12:19, Alexey Shumkin wrote:
> One can set an alias
> 	$ git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)%an%Creset' --abbrev-commit --date=local"
>
> to see the log as a pretty tree (like *gitk* but in a terminal).
>
> However, log messages written in an encoding i18n.commitEncoding which
> differs from terminal encoding are shown corrupted even when
> i18n.logOutputEncoding and terminal encoding are the same (e.g. log
> messages committed on a Cygwin box with Windows-1251 encoding seen on a
> Linux box with a UTF-8 encoding and vice versa).
>
> To simplify an example we can say the following two commands are
> expected to give the same output to a terminal:
>
> 	$ git log --oneline --no-color
> 	$ git log --pretty=format:'%h %s'
>
> However, the former pays attention to i18n.logOutputEncoding
> configuration, while the latter does not when it formats %s.
>
> The same corruption is true for
> 	$ git diff --submodule=log
> and
> 	$ git rev-list --pretty=format:%s HEAD
> and
> 	$ git reset --hard
>
> This patch adds failing tests for the next patch that fixes them.
>
> Signed-off-by: Alexey Shumkin <alex.crez...@gmail.com>

> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
> index 73ba5e8..6b62da2 100755
> --- a/t/t4205-log-pretty-formats.sh
> +++ b/t/t4205-log-pretty-formats.sh
> ...
> +commit_msg() {
> +	# String "initial. initial" partly in German (translated with Google Translate),
> +	# encoded in UTF-8, used as a commit log message below.
> +	msg=$(printf "initial. anf\303\244nglich")
> +	if test -n "$1"
> +	then
> +		msg=$(echo $msg | iconv -f utf-8 -t $1)
> +	fi
> +	if test -n "$2" -a -n "$3"
> +	then
> +		# cut string, replace cut part with two dots
> +		# $2 - chars count from the beginning of the string
> +		# $3 - trailing chars
> +		# LC_ALL is set to make `sed` interpret "." as a UTF-8 char not a byte
> +		# as it does with C locale
> +		msg=$(echo $msg | LC_ALL=en_US.UTF-8 sed -e "s/^\(.\{$2\}\)$3/\1../")

This does not work as expected on Windows because sed ignores the .UTF-8
part of the locale specifier. (We don't even have en_US; we have de, but
with de.UTF-8 this doesn't work, either.) I don't have an idea yet how
to work around it.

> +	fi
> +	echo $msg
> +}

> -test_expect_success 'left alignment formatting with mtrunc' '
> -	git log --pretty="format:%<(10,mtrunc)%s" >actual &&
> +test_expect_failure 'left alignment formatting with mtrunc' "
> +	git log --pretty='format:%<(10,mtrunc)%s' >actual &&
>  	# complete the incomplete line at the end
>  	echo >>actual &&
>  	qz_to_tab_space <<\EOF >expected &&
>  mess.. two
>  mess.. one
>  add bar  Z
> -initial  Z
> +$(commit_msg "" "4" ".\{11\}")
>  EOF
>  	test_cmp expected actual
> -'
> +"

This is the failing test case.

BTW, if you re-roll, there would be fewer changes needed if you kept the
test code single-quoted, but changed <<\EOF to <<EOF where needed.

> diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
> index cc1008d..c66a07f 100755
> --- a/t/t6006-rev-list-format.sh
> +++ b/t/t6006-rev-list-format.sh
> ...
>  test_expect_success 'setup' '
>  	: >foo &&
>  	git add foo &&
> -	git commit -m "added foo" &&
> +	git config i18n.commitEncoding iso-8859-1 &&

Perhaps
	test_config i18n.commitEncoding iso-8859-1 &&

Also, it is iso-8859-1 here, but we see iso8859-1 already used later.
It's probably wise to use that same encoding name everywhere because we
can be very sure that the latter is already understood on all supported
platforms.

> +	git commit -m "$added_iso88591" &&
>  	head1=$(git rev-parse --verify HEAD) &&
>  	head1_short=$(git rev-parse --verify --short $head1) &&
>  	tree1=$(git rev-parse --verify HEAD:) &&
>  	tree1_short=$(git rev-parse --verify --short $tree1) &&
> -	echo changed >foo &&
> -	git commit -a -m "changed foo" &&
> +	echo "$changed" >foo &&
> +	git commit -a -m "$changed_iso88591" &&
>  	head2=$(git rev-parse --verify HEAD) &&
>  	head2_short=$(git rev-parse --verify --short $head2) &&
>  	tree2=$(git rev-parse --verify HEAD:) &&
>  	tree2_short=$(git rev-parse --verify --short $tree2)
> +	git config --unset i18n.commitEncoding
>  '
>
> -# usage: test_format name format_string <expected_output
> +# usage: test_format [failure] name format_string <expected_output
>  test_format () {
> +	must_fail=0
> +	# if parameters count is more than 2 then test must fail
> +	if test $# -gt 2
> +	then
> +		must_fail=1
> +		# remove first parameter which is flag for test failure
> +		shift
> +	fi
>  	cat >expect.$1
> -	test_expect_success "format $1" "
> -		git rev-list --pretty=format:'$2' master >output.$1 &&
> -		test_cmp expect.$1 output.$1
> -	"
> +	name="format $1"
> +	command="git rev-list --pretty=format:'$2' master >output.$1 &&
> +
Re: [PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
On Mon, Jul 01, 2013 at 09:00:55AM +0200, Johannes Sixt wrote:
> On 6/26/2013 12:19, Alexey Shumkin wrote:
>> One can set an alias
>> 	$ git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)%an%Creset' --abbrev-commit --date=local"
>>
>> to see the log as a pretty tree (like *gitk* but in a terminal).
>>
>> However, log messages written in an encoding i18n.commitEncoding which
>> differs from terminal encoding are shown corrupted even when
>> i18n.logOutputEncoding and terminal encoding are the same (e.g. log
>> messages committed on a Cygwin box with Windows-1251 encoding seen on a
>> Linux box with a UTF-8 encoding and vice versa).
>>
>> To simplify an example we can say the following two commands are
>> expected to give the same output to a terminal:
>>
>> 	$ git log --oneline --no-color
>> 	$ git log --pretty=format:'%h %s'
>>
>> However, the former pays attention to i18n.logOutputEncoding
>> configuration, while the latter does not when it formats %s.
>>
>> The same corruption is true for
>> 	$ git diff --submodule=log
>> and
>> 	$ git rev-list --pretty=format:%s HEAD
>> and
>> 	$ git reset --hard
>>
>> This patch adds failing tests for the next patch that fixes them.
>>
>> Signed-off-by: Alexey Shumkin <alex.crez...@gmail.com>
>
>> diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
>> index 73ba5e8..6b62da2 100755
>> --- a/t/t4205-log-pretty-formats.sh
>> +++ b/t/t4205-log-pretty-formats.sh
>> ...
>> +commit_msg() {
>> +	# String "initial. initial" partly in German (translated with Google Translate),
>> +	# encoded in UTF-8, used as a commit log message below.
>> +	msg=$(printf "initial. anf\303\244nglich")
>> +	if test -n "$1"
>> +	then
>> +		msg=$(echo $msg | iconv -f utf-8 -t $1)
>> +	fi
>> +	if test -n "$2" -a -n "$3"
>> +	then
>> +		# cut string, replace cut part with two dots
>> +		# $2 - chars count from the beginning of the string
>> +		# $3 - trailing chars
>> +		# LC_ALL is set to make `sed` interpret "." as a UTF-8 char not a byte
>> +		# as it does with C locale
>> +		msg=$(echo $msg | LC_ALL=en_US.UTF-8 sed -e "s/^\(.\{$2\}\)$3/\1../")
>
> This does not work as expected on Windows because sed ignores the .UTF-8
> part of the locale specifier. (We don't even have en_US; we have de, but
> with de.UTF-8 this doesn't work, either.) I don't have an idea yet how
> to work around it.

Hmm. I have Cygwin v1.7 (on Windows 7 and Windows 2003 Server R2) with
many locales installed (including en_US.UTF-8).
Today I could not find a way to simulate a system without the
en_US.UTF-8 locale installed, so I could not reproduce your failure.

>> +	fi
>> +	echo $msg
>> +}
>
>> -test_expect_success 'left alignment formatting with mtrunc' '
>> -	git log --pretty="format:%<(10,mtrunc)%s" >actual &&
>> +test_expect_failure 'left alignment formatting with mtrunc' "
>> +	git log --pretty='format:%<(10,mtrunc)%s' >actual &&
>>  	# complete the incomplete line at the end
>>  	echo >>actual &&
>>  	qz_to_tab_space <<\EOF >expected &&
>>  mess.. two
>>  mess.. one
>>  add bar  Z
>> -initial  Z
>> +$(commit_msg "" "4" ".\{11\}")
>>  EOF
>>  	test_cmp expected actual
>> -'
>> +"
>
> This is the failing test case.

Hmm, for me all these tests pass on both the Linux and the Cygwin boxes
mentioned above.

> BTW, if you re-roll, there would be fewer changes needed if you kept the
> test code single-quoted, but changed <<\EOF to <<EOF where needed.

Yep, thanks for the correction.

>> diff --git a/t/t6006-rev-list-format.sh b/t/t6006-rev-list-format.sh
>> index cc1008d..c66a07f 100755
>> --- a/t/t6006-rev-list-format.sh
>> +++ b/t/t6006-rev-list-format.sh
>> ...
>>  test_expect_success 'setup' '
>>  	: >foo &&
>>  	git add foo &&
>> -	git commit -m "added foo" &&
>> +	git config i18n.commitEncoding iso-8859-1 &&
>
> Perhaps
> 	test_config i18n.commitEncoding iso-8859-1 &&
>
> Also, it is iso-8859-1 here, but we see iso8859-1 already used later.
> It's probably wise to use that same encoding name everywhere because we
> can be very sure that the latter is already understood on all supported
> platforms.

You're right (I've looked at the explanation in
3994e8a98dc7bbf67e61d23c8125f44383499a1f; I had thought ISO-8859-1 was
the common name).

>> +	git commit -m "$added_iso88591" &&
>>  	head1=$(git rev-parse --verify HEAD) &&
>>  	head1_short=$(git rev-parse --verify --short $head1) &&
>>  	tree1=$(git rev-parse --verify HEAD:) &&
>>  	tree1_short=$(git rev-parse --verify --short $tree1) &&
>> -	echo changed >foo &&
>> -	git commit -a -m "changed foo" &&
>> +	echo "$changed" >foo &&
>> +	git commit -a -m "$changed_iso88591" &&
>>  	head2=$(git rev-parse --verify HEAD) &&
>>  	head2_short=$(git rev-parse --verify --short $head2) &&
>>  	tree2=$(git rev-parse --verify HEAD:) &&
>>  	tree2_short=$(git rev-parse --verify --short $tree2)
>> +	git config --unset i18n.commitEncoding
>>  '
>>
>> -# usage: test_format name format_string <expected_output
>> +# usage: test_format
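
The disagreement above (the truncation works on the Linux and Cygwin boxes but not on Hannes's Windows setup) comes down to whether sed treats "." as a character or as a byte under the requested locale. A minimal probe of that behaviour is sketched below, assuming only a POSIX shell with sed, tr and wc; the function name is hypothetical, and en_US.UTF-8 is simply the locale named in the patch.

```shell
# Hypothetical probe: how many units does sed's "." see in one a-umlaut,
# which is two bytes (\303\244) in UTF-8?
# Prints 1 when sed honours the UTF-8 locale, 2 when it works byte-wise
# (for example because the locale is missing or ignored).
sed_dot_units () {
	printf '\303\244\n' |
	LC_ALL=${1:-en_US.UTF-8} sed -e 's/./x/g' |
	tr -d '\n' |
	wc -c |
	tr -d ' '
}

echo "units seen under en_US.UTF-8: $(sed_dot_units)"
echo "units seen under C:           $(sed_dot_units C)"
```

A result of 2 for the UTF-8 locale would mean the sed expression in commit_msg() cuts in the middle of a multi-byte character on that system.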
[PATCH v7 4/5] pretty: Add failing tests: --format output should honor logOutputEncoding
One can set an alias
	$ git config alias.lg "log --graph --pretty=format:'%Cred%h%Creset -%C(yellow)%d%Creset %s %Cgreen(%cd) %C(bold blue)%an%Creset' --abbrev-commit --date=local"

to see the log as a pretty tree (like *gitk* but in a terminal).

However, log messages written in an encoding i18n.commitEncoding which
differs from terminal encoding are shown corrupted even when
i18n.logOutputEncoding and terminal encoding are the same (e.g. log
messages committed on a Cygwin box with Windows-1251 encoding seen on a
Linux box with a UTF-8 encoding and vice versa).

To simplify an example we can say the following two commands are
expected to give the same output to a terminal:

	$ git log --oneline --no-color
	$ git log --pretty=format:'%h %s'

However, the former pays attention to i18n.logOutputEncoding
configuration, while the latter does not when it formats %s.

The same corruption is true for
	$ git diff --submodule=log
and
	$ git rev-list --pretty=format:%s HEAD
and
	$ git reset --hard

This patch adds failing tests for the next patch that fixes them.

Signed-off-by: Alexey Shumkin <alex.crez...@gmail.com>
---
 t/t4041-diff-submodule-option.sh | 35 +
 t/t4205-log-pretty-formats.sh    | 149 ---
 t/t6006-rev-list-format.sh       | 83 +++---
 t/t7102-reset.sh                 | 29 +++-
 4 files changed, 199 insertions(+), 97 deletions(-)

diff --git a/t/t4041-diff-submodule-option.sh b/t/t4041-diff-submodule-option.sh
index 32d4a60..2a7877d 100755
--- a/t/t4041-diff-submodule-option.sh
+++ b/t/t4041-diff-submodule-option.sh
@@ -1,6 +1,7 @@
 #!/bin/sh
 #
 # Copyright (c) 2009 Jens Lehmann, based on t7401 by Ping Yin
+# Copyright (c) 2011 Alexey Shumkin (+ non-UTF-8 commit encoding tests)
 #
 
 test_description='Support for verbose submodule differences in git diff
@@ -10,6 +11,9 @@ This test tries to verify the sanity of the --submodule option of git diff.
 
 . ./test-lib.sh
 
+# String "added" in German (translated with Google Translate), encoded in UTF-8,
+# used in sample commit log messages in add_file() function below.
+added=$(printf "hinzugef\303\274gt")
 add_file () {
 	(
 		cd "$1" &&
@@ -19,7 +23,8 @@ add_file () {
 			echo "$name" >"$name" &&
 			git add "$name" &&
 			test_tick &&
-			git commit -m "Add $name" || exit
+			msg_added_iso88591=$(echo "Add $name ($added $name)" | iconv -f utf-8 -t iso-8859-1) &&
+			git -c 'i18n.commitEncoding=iso-8859-1' commit -m "$msg_added_iso88591"
 		done >/dev/null &&
 		git rev-parse --short --verify HEAD
 	)
@@ -89,29 +94,29 @@ test_expect_success 'diff.submodule does not affect plumbing' '
 commit_file sm1 &&
 head2=$(add_file sm1 foo3)
 
-test_expect_success 'modified submodule(forward)' '
+test_expect_failure 'modified submodule(forward)' '
 	git diff-index -p --submodule=log HEAD >actual &&
 	cat >expected <<-EOF &&
 	Submodule sm1 $head1..$head2:
-	  > Add foo3
+	  > Add foo3 ($added foo3)
 	EOF
 	test_cmp expected actual
 '
 
-test_expect_success 'modified submodule(forward)' '
+test_expect_failure 'modified submodule(forward)' '
 	git diff --submodule=log >actual &&
 	cat >expected <<-EOF &&
 	Submodule sm1 $head1..$head2:
-	  > Add foo3
+	  > Add foo3 ($added foo3)
 	EOF
 	test_cmp expected actual
 '
 
-test_expect_success 'modified submodule(forward) --submodule' '
+test_expect_failure 'modified submodule(forward) --submodule' '
 	git diff --submodule >actual &&
 	cat >expected <<-EOF &&
 	Submodule sm1 $head1..$head2:
-	  > Add foo3
+	  > Add foo3 ($added foo3)
 	EOF
 	test_cmp expected actual
 '
@@ -138,25 +143,25 @@ head3=$(
 	git rev-parse --short --verify HEAD
 )
 
-test_expect_success 'modified submodule(backward)' '
+test_expect_failure 'modified submodule(backward)' '
 	git diff-index -p --submodule=log HEAD >actual &&
 	cat >expected <<-EOF &&
 	Submodule sm1 $head2..$head3 (rewind):
-	  < Add foo3
-	  < Add foo2
+	  < Add foo3 ($added foo3)
+	  < Add foo2 ($added foo2)
 	EOF
 	test_cmp expected actual
 '
 
 head4=$(add_file sm1 foo4 foo5)
-test_expect_success 'modified submodule(backward and forward)' '
+test_expect_failure 'modified submodule(backward and forward)' '
 	git diff-index -p --submodule=log HEAD >actual &&
 	cat >expected <<-EOF &&
 	Submodule sm1 $head2...$head4:
-	  > Add foo5
-	  > Add foo4
-	  < Add foo3
-	  < Add foo2
+	  > Add foo5 ($added foo5)
+	  > Add foo4 ($added foo4)
+	  < Add foo3 ($added foo3)
+	  < Add foo2 ($added foo2)
 	EOF
 	test_cmp expected actual
 '
diff --git a/t/t4205-log-pretty-formats.sh b/t/t4205-log-pretty-formats.sh
index
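
To make the helper strings in the t4041 hunk concrete, here is a small sketch, assuming iconv is available, showing that the same word is a different byte sequence in the two encodings. That difference is what a terminal shows as corruption when the log message is printed without re-encoding. `byte_len` is a made-up helper name, and the encoding is spelled iso8859-1 as recommended in the review.

```shell
# "hinzugefuegt" with u-umlaut, written as UTF-8 bytes (same string as in the patch)
added=$(printf 'hinzugef\303\274gt')

# Hypothetical helper: count bytes, not characters.
# tr strips the padding that some wc implementations add.
byte_len () {
	printf '%s' "$1" | wc -c | tr -d ' '
}

# The same word converted to ISO-8859-1, as add_file() does for commit messages
added_iso88591=$(printf '%s' "$added" | iconv -f utf-8 -t iso8859-1)

utf8_len=$(byte_len "$added")             # u-umlaut takes two bytes in UTF-8
latin1_len=$(byte_len "$added_iso88591")  # and one byte in ISO-8859-1
echo "utf-8: $utf8_len bytes, iso8859-1: $latin1_len bytes"
```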