On 06/25/2012 07:17 AM, Petr Viktorin wrote:
The translation files we currently store in Git are full of redundant
information: source strings for untranslated messages, and file locations.
The first causes unnecessarily huge files. The second makes diffs
unreadable: when code is edited and line numbers change, metadata for
all messages shows up as changed. This makes reviewing translation
patches, and merging possible conflicts, hard -- it requires specialized

This patch changes the Makefile to strip the unneeded data from .po files.

Translators using Git must now run msgmerge (or, `make merge-po`) to get
.po files they can work with. Transifex users are unaffected, as the
source .pot file is not changed.

The i18n tests use file locations for producing nice error reports¹.
To make this work as before, the .pot is merged in before validation to
restore comments.
Currently this takes a noticeable amount of time, because polib uses a
particularly naïve algorithm for merging. I've sent a patch to polib to
resolve this; once that makes it downstream merging will be fast again.

Updating the translations with the new Makefile will cause a >5MB patch.
I don't want to pollute the mailing list with it, at least until the
Makefile patch is reviewed. It's available


¹ And for divining the programming language messages come from, but that
is only done on the .pot file, unaffected by this patch.

Good work and it's very close to getting an ACK.

There is now a discrepancy between what the Makefile thinks is the list of po files and the actual list of po files after running strip-po. This causes confusing errors.

I think the source of this problem is the Makefile has a list of po files in the variable $(po_files)

For starters why is:

        @for po_file in $$(ls *.po); do \

instead of:

        @for po_file in $(po_files); do \

If you run "make validate-po" before running "make strip-po" you get:

5 errors in 21 files

After stripping the po files "make validate-po" gives you:

14 errors in 21 files

The extra 9 errors are due to the fact validate-po is being asked to validate a non-existent po file which it considers an error (which I believe is a correct check).

"make msg-stats" gets confused for the same reason, it's asked to examine files that no longer exist.

"make mo-files" now fails catastrophically for the same reason, it's being asked to operate on files that don't exist.

In general large parts of the Makefile will now be confused or generate errors because the file list is incorrect.

Somehow we have to align the list of po files. That presents all sorts of interesting questions:

* does the list come from the LINQUAS file? (current method)

* does the list come from git? Doesn't work if you're not in a git development tree. This problem is easily seen when the RPM's are built. No file list can be generated because there is no git repo so you end up with 0 files being passed to the validation commands. Since validation is not critical when building RPM's this hasn't been a show stopper but it really needs to be fixed in some way at some point.

* does the list come from the current directory contents? What you did with strip-po, but that also has a potential for errors. What if someone deletes or adds a file in their tree by mistake?

* should "make strip-po" edit the LINGUAS file? (maybe the best solution). Maybe when it detects an empty file and removes it it should run a sed command to delete the line in LINGUAS?

It may not be evident from Makefile.in but over the years there has been competing strategies for how to get our list of files. Simo added the "git ls-files" strategy because he didn't want to have an explict list which had to be maintained (a valid concern) that still left us with the PY_EXPLICIT_FILES list, so how much did that really accomplish? Maybe PY_EXPLICIT_FILES can be removed in favor of a utility that tests the file type (or the hashbang interpreter line). But that still ties things to a git tree (ugh).

If you have any great ideas on how to address the file list issue it would be good to hear. However in the interim we have to somehow adjust the po file list after strip-po runs, once that's done I'm happy to ACK it.

I wouldn't be surprised if you responded "Well, the file list discrepancy only occurs when a maintainer is explicitly stripping po files and they should know they have to adjust the LINGUAS file therefore these confusing errors won't be seen by someone who would be confused by them". Maybe yes, maybe no. I can think of plenty of times I debugged some build/configure/make failure and groaned because it was in some area that was totally cryptic and unknown to me, took a long time to unravel and had a trivial adjustment fix that would have only been known to an expert in that part of the code.

But here are some other issues I saw when exercising the patch, not a problem with the patch but something that needs fixing.

"make validate-pot" produces 11 errors, it's because the pot file is old, we have to a "make update-pot" and commit it to git.

"make validate-po" shows 5 errors in es.po. I could have sworn I fixed these months ago. Maybe it only got committed to the 2.2. branch? We either have to fix the po file and push it to Transfix (but downloading a current TX version first!) or we have to go into TX and edit the offending msgstr's. Some of these errors are exactly the kind of thing which will cause us to throw an exception at run time if the locale is ES.

Anyway, in summary the patch is great, the idea is good, we just need to fix the file list somehow. And at some point the other two issues above will need some attention.

John Dennis <jden...@redhat.com>

Looking to carve out IT costs?

Freeipa-devel mailing list

Reply via email to