Re: [PATCH 12/12] fsck: mark strings for translation

2018-11-05 Thread Duy Nguyen
On Mon, Oct 29, 2018 at 12:53 PM SZEDER Gábor  wrote:
> The contents of 'out':
>
>   broken link fromtree be45bbd3809e0829297cefa576e699c134abacfd 
> (refs/heads/master@{1112912113}:caesar.t)
> toblob be45bbd3809e0829297cefa576e699c134abacfd 
> (refs/heads/master@{1112912113}:caesar.t)
>   missing blob be45bbd3809e0829297cefa576e699c134abacfd 
> (refs/heads/master@{1112912113}:caesar.t)
>
> On a related (side)note, I think it would be better if this 'grep'

I found the problem. Some function returns static string which works
fine when we do two printf()'s, one call of that function per
printf(). But when combining two printf() in one, we have a problem.
Thanks for catching.
-- 
Duy


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-31 Thread Jiang Xin
Ævar Arnfjörð Bjarmason  于2018年10月30日周二 上午1:43写道:
> On Mon, Oct 29 2018, Ævar Arnfjörð Bjarmason wrote:
> Unlike the rest of our stack where we need to support however many years
> old tools we can freely rely on bleeding-edge gettext features, since
> the only person we need to convince to upgrade is Jiang. I.e. he accepts
> the PRs from translators, (presumably) runs msgfmt --check and then
> submits the result to Junio.

I used a shell script to check commits before I send a pull request to Junio.
This script is in the po-helper branch, see:
https://github.com/git-l10n/git-po/tree/po-helper

It can catch unmatched '\n' errors (missing or unnecessary '\n').

$ git diff
   diff --git a/po/zh_CN.po b/po/zh_CN.po
   index eabd4d1f8e..6b0d9ebc60 100644
   --- a/po/zh_CN.po
   +++ b/po/zh_CN.po
   @@ -5157,7 +5157,7 @@ msgstr  "略过补丁 '%s'。"
#
#, perl-format
msgid   "Skipping %s with backup suffix '%s'.\n"
   -msgstr  "略过 %s 含备份后缀 '%s'。\n"
   +msgstr  "略过 %s 含备份后缀 '%s'。"
#
#, c-format
msgid   "Skipping repository %s\n"

   $ LC_ALL=C po-helper.sh check
   
   bg.po : 3958 translated messages.
   ca.po : 3328 translated messages, 18 fuzzy translations, 30
untranslated messages.
   de.po : 3958 translated messages.
   es.po : 3958 translated messages.
   fr.po : 3957 translated messages, 1 fuzzy translation.
   is.po : 14 translated messages.
   it.po : 716 translated messages, 350 untranslated messages.
   ko.po : 3608 translated messages.
   pt_PT.po  : 3198 translated messages.
   ru.po : 3366 translated messages, 594 untranslated messages.
   sv.po : 3958 translated messages.
   vi.po : 3958 translated messages.
   zh_CN.po  : po/zh_CN.po:19717: 'msgid' and 'msgstr' entries do not
both end with '\n'
   msgfmt: found 1 fatal error
   3958 translated messages.
   
   Show updates of git.pot...

   # Nothing changed. (run 'make pot' first)
   
   Check commits...

   0 commits checked complete.
   
   Note: If you want to check for upstream l10n update, run:
   Note:
   Note: po-helper.sh check update 
   

So, no warry about it. BTW, I agree with Jonathan.

> Jonathan said:
> IMHO the advantage of leaving the \n out in the message is not so much
> that we don't trust translators but more that where we can make their
> lives easier, we should.

--
Jiang Xin


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-30 Thread Jonathan Nieder
Hi,

Ævar Arnfjörð Bjarmason wrote:
>> On Mon, Oct 29, 2018 at 3:09 PM Junio C Hamano  wrote:
>>> SZEDER Gábor  writes:
 Nguyễn Thái Ngọc Duy wrote:

> -fprintf(stderr, "%s in %s %s: %s\n",
> -msg_type, printable_type(obj), describe_object(obj), err);
> +fprintf_ln(stderr, _("%s in %s %s: %s"),

 Are the (f)printf() -> (f)printf_ln() changes all over
 'builtin/fsck.c' really necessary to mark strings for translation?
>>>
>>> It is beyond absolute minimum but I saw it argued here that this
>>> makes it easier to manage the .po and .pot files if your message
>>> strings do not end with LF, a you are much less likely to _add_
>>> unneeded LF to the translated string than _lose_ LF at the end of
>>> translated string.
[...]
> As Jonathan pointed out in the follow-up message[1] this sort of thing
> is checked for in msgfmt, so sure, let's strip the \n, but it's really
> not something we need to worry about. Likewise with translators turning
> "%s" into "%d" (or "% s") or whatever.

IMHO the advantage of leaving the \n out in the message is not so much
that we don't trust translators but more that where we can make their
lives easier, we should.

In other words, I'm glad the patch does that, and Ævar, I agree.

Thanks, both.

Jonathan

> 1. 
> https://public-inbox.org/git/cacsjy8acuy9fziiehgc7mel4i+xp6u0pmh1rgor-wznhyt1...@mail.gmail.com/


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-29 Thread Ævar Arnfjörð Bjarmason


On Mon, Oct 29 2018, Ævar Arnfjörð Bjarmason wrote:

> On Mon, Oct 29 2018, Duy Nguyen wrote:
>
>> On Mon, Oct 29, 2018 at 3:09 PM Junio C Hamano  wrote:
>>>
>>> SZEDER Gábor  writes:
>>>
>>> >> -fprintf(stderr, "%s in %s %s: %s\n",
>>> >> -msg_type, printable_type(obj), describe_object(obj), err);
>>> >> +fprintf_ln(stderr, _("%s in %s %s: %s"),
>>> >
>>> > Are the (f)printf() -> (f)printf_ln() changes all over
>>> > 'builtin/fsck.c' really necessary to mark strings for translation?
>>>
>>> It is beyond absolute minimum but I saw it argued here that this
>>> makes it easier to manage the .po and .pot files if your message
>>> strings do not end with LF, a you are much less likely to _add_
>>> unneeded LF to the translated string than _lose_ LF at the end of
>>> translated string.
>>
>> Especially when \n plays an important role and we don't trust
>> translators to keep it [1] [2]. It's probably a too defensive stance
>> and often does not apply, so nowadays I do it just to keep a
>> consistent pattern in the code.
>>
>> [1] https://public-inbox.org/git/20120308220131.GA10122@burratino/#t
>> [2] but then translators can crash programs anyway (e.g. changing %d
>> to %s...) we just trust gettext tools to catch problems early.
>
> As Jonathan pointed out in the follow-up message[1] this sort of thing
> is checked for in msgfmt, so sure, let's strip the \n, but it's really
> not something we need to worry about. Likewise with translators turning
> "%s" into "%d" (or "% s") or whatever.
>
> $ git diff -U0
> diff --git a/po/de.po b/po/de.po
> index 47986814c9..62de46c63d 100644
> --- a/po/de.po
> +++ b/po/de.po
> @@ -23 +23 @@ msgid "%shint: %.*s%s\n"
> -msgstr "%sHinweis: %.*s%s\n"
> +msgstr "%sHinweis: %.*s%s"
> $ make [...]
> [...]
> po/de.po:23: 'msgid' and 'msgstr' entries do not both end with '\n'
> msgfmt: found 1 fatal error
> 3470 translated messages.
> make: *** [Makefile:2488: po/build/locale/de/LC_MESSAGES/git.mo] Error 1
>
> 1. 
> https://public-inbox.org/git/cacsjy8acuy9fziiehgc7mel4i+xp6u0pmh1rgor-wznhyt1...@mail.gmail.com/

...to add to that, if there *are* formatting errors that --check doesn't
spot let's hear about it so we can just patch gettext upstream:
https://github.com/autotools-mirror/gettext/blob/master/gettext-tools/src/msgl-check.c#L572-L756

Unlike the rest of our stack where we need to support however many years
old tools we can freely rely on bleeding-edge gettext features, since
the only person we need to convince to upgrade is Jiang. I.e. he accepts
the PRs from translators, (presumably) runs msgfmt --check and then
submits the result to Junio.

See the "Usually..." paragraph in my 66f5f6dca9 ("C style: use standard
style for "TRANSLATORS" comments", 2017-05-11) for an example. We
already rely on a fairly recent gettext toolchain.


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-29 Thread Ævar Arnfjörð Bjarmason


On Mon, Oct 29 2018, Duy Nguyen wrote:

> On Mon, Oct 29, 2018 at 3:09 PM Junio C Hamano  wrote:
>>
>> SZEDER Gábor  writes:
>>
>> >> -fprintf(stderr, "%s in %s %s: %s\n",
>> >> -msg_type, printable_type(obj), describe_object(obj), err);
>> >> +fprintf_ln(stderr, _("%s in %s %s: %s"),
>> >
>> > Are the (f)printf() -> (f)printf_ln() changes all over
>> > 'builtin/fsck.c' really necessary to mark strings for translation?
>>
>> It is beyond absolute minimum but I saw it argued here that this
>> makes it easier to manage the .po and .pot files if your message
>> strings do not end with LF, a you are much less likely to _add_
>> unneeded LF to the translated string than _lose_ LF at the end of
>> translated string.
>
> Especially when \n plays an important role and we don't trust
> translators to keep it [1] [2]. It's probably a too defensive stance
> and often does not apply, so nowadays I do it just to keep a
> consistent pattern in the code.
>
> [1] https://public-inbox.org/git/20120308220131.GA10122@burratino/#t
> [2] but then translators can crash programs anyway (e.g. changing %d
> to %s...) we just trust gettext tools to catch problems early.

As Jonathan pointed out in the follow-up message[1] this sort of thing
is checked for in msgfmt, so sure, let's strip the \n, but it's really
not something we need to worry about. Likewise with translators turning
"%s" into "%d" (or "% s") or whatever.

$ git diff -U0
diff --git a/po/de.po b/po/de.po
index 47986814c9..62de46c63d 100644
--- a/po/de.po
+++ b/po/de.po
@@ -23 +23 @@ msgid "%shint: %.*s%s\n"
-msgstr "%sHinweis: %.*s%s\n"
+msgstr "%sHinweis: %.*s%s"
$ make [...]
[...]
po/de.po:23: 'msgid' and 'msgstr' entries do not both end with '\n'
msgfmt: found 1 fatal error
3470 translated messages.
make: *** [Makefile:2488: po/build/locale/de/LC_MESSAGES/git.mo] Error 1

1. 
https://public-inbox.org/git/cacsjy8acuy9fziiehgc7mel4i+xp6u0pmh1rgor-wznhyt1...@mail.gmail.com/


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-29 Thread Duy Nguyen
On Mon, Oct 29, 2018 at 3:09 PM Junio C Hamano  wrote:
>
> SZEDER Gábor  writes:
>
> >> -fprintf(stderr, "%s in %s %s: %s\n",
> >> -msg_type, printable_type(obj), describe_object(obj), err);
> >> +fprintf_ln(stderr, _("%s in %s %s: %s"),
> >
> > Are the (f)printf() -> (f)printf_ln() changes all over
> > 'builtin/fsck.c' really necessary to mark strings for translation?
>
> It is beyond absolute minimum but I saw it argued here that this
> makes it easier to manage the .po and .pot files if your message
> strings do not end with LF, a you are much less likely to _add_
> unneeded LF to the translated string than _lose_ LF at the end of
> translated string.

Especially when \n plays an important role and we don't trust
translators to keep it [1] [2]. It's probably a too defensive stance
and often does not apply, so nowadays I do it just to keep a
consistent pattern in the code.

[1] https://public-inbox.org/git/20120308220131.GA10122@burratino/#t
[2] but then translators can crash programs anyway (e.g. changing %d
to %s...) we just trust gettext tools to catch problems early.
-- 
Duy


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-29 Thread Junio C Hamano
SZEDER Gábor  writes:

>> -fprintf(stderr, "%s in %s %s: %s\n",
>> -msg_type, printable_type(obj), describe_object(obj), err);
>> +fprintf_ln(stderr, _("%s in %s %s: %s"),
>
> Are the (f)printf() -> (f)printf_ln() changes all over
> 'builtin/fsck.c' really necessary to mark strings for translation?

It is beyond absolute minimum but I saw it argued here that this
makes it easier to manage the .po and .pot files if your message
strings do not end with LF, a you are much less likely to _add_
unneeded LF to the translated string than _lose_ LF at the end of
translated string.


Re: [PATCH 12/12] fsck: mark strings for translation

2018-10-29 Thread SZEDER Gábor
On Sun, Oct 28, 2018 at 07:51:57AM +0100, Nguyễn Thái Ngọc Duy wrote:
> Signed-off-by: Nguyễn Thái Ngọc Duy 
> ---
>  builtin/fsck.c | 113 -
>  t/t1410-reflog.sh  |   6 +-
>  t/t1450-fsck.sh|  50 
>  t/t6050-replace.sh |   4 +-
>  t/t7415-submodule-names.sh |   6 +-
>  5 files changed, 94 insertions(+), 85 deletions(-)
> 
> diff --git a/builtin/fsck.c b/builtin/fsck.c
> index 06eb421720..13f8fe35c5 100644
> --- a/builtin/fsck.c
> +++ b/builtin/fsck.c
> @@ -108,8 +108,9 @@ static int fsck_config(const char *var, const char 
> *value, void *cb)
>  static void objreport(struct object *obj, const char *msg_type,
>   const char *err)
>  {
> - fprintf(stderr, "%s in %s %s: %s\n",
> - msg_type, printable_type(obj), describe_object(obj), err);
> + fprintf_ln(stderr, _("%s in %s %s: %s"),

Are the (f)printf() -> (f)printf_ln() changes all over
'builtin/fsck.c' really necessary to mark strings for translation?

> diff --git a/t/t1450-fsck.sh b/t/t1450-fsck.sh
> index 90c765df3a..500c21366e 100755
> --- a/t/t1450-fsck.sh
> +++ b/t/t1450-fsck.sh

> @@ -594,7 +594,7 @@ test_expect_success 'fsck --name-objects' '
>   remove_object $(git rev-parse julius:caesar.t) &&
>   test_must_fail git fsck --name-objects >out &&
>   tree=$(git rev-parse --verify julius:) &&
> - egrep "$tree \((refs/heads/master|HEAD)@\{[0-9]*\}:" out
> + test -n "$GETTEXT_POISON" || egrep "$tree 
> \((refs/heads/master|HEAD)@\{[0-9]*\}:" out

'test_i18ngrep' accepts all 'grep' options, so this could be written
as:

  test_i18ngrep -E "..." out


There might be something else wrong with the patch, because now this
test tends to fail on current 'pu' on Travis CI:

  [... snipped output from 'test_commit' ...]
  +git rev-parse julius:caesar.t
  +remove_object be45bbd3809e0829297cefa576e699c134abacfd
  +sha1_file be45bbd3809e0829297cefa576e699c134abacfd
  +remainder=45bbd3809e0829297cefa576e699c134abacfd
  +firsttwo=be
  +echo .git/objects/be/45bbd3809e0829297cefa576e699c134abacfd
  +rm .git/objects/be/45bbd3809e0829297cefa576e699c134abacfd
  +test_must_fail git fsck --name-objects
  +_test_ok=
  +git fsck --name-objects
  +exit_code=2
  +test 2 -eq 0
  +test_match_signal 13 2
  +test 2 = 141
  +test 2 = 269
  +return 1
  +test 2 -gt 129
  +test 2 -eq 127
  +test 2 -eq 126
  +return 0
  +git rev-parse --verify julius:
  +tree=c2fab98f409a47394d992eca10a20e0b22377c0c
  +test -n 
  +egrep c2fab98f409a47394d992eca10a20e0b22377c0c 
\((refs/heads/master|HEAD)@\{[0-9]*\}: out
  error: last command exited with $?=1
  not ok 65 - fsck --name-objects

The contents of 'out':

  broken link fromtree be45bbd3809e0829297cefa576e699c134abacfd 
(refs/heads/master@{1112912113}:caesar.t)
toblob be45bbd3809e0829297cefa576e699c134abacfd 
(refs/heads/master@{1112912113}:caesar.t)
  missing blob be45bbd3809e0829297cefa576e699c134abacfd 
(refs/heads/master@{1112912113}:caesar.t)

On a related (side)note, I think it would be better if this 'grep'
pattern were more explicit about which of the three lines it is
supposed to match.


Anyway, I couldn't reproduce the failure locally yet, but, admittedly,
I didn't try that hard...  And my builds on Travis CI haven't yet come
that far to try this topic on its own.