Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-25 Thread John Dennis

On 07/24/2012 06:39 AM, Petr Viktorin wrote:

On 07/24/2012 01:12 AM, John Dennis wrote:

On 07/23/2012 06:27 AM, Petr Viktorin wrote:

As a translator (for another project), I don't like Transifex and prefer
to send good old Git pull requests. I understand a traditional
workflow is hard to coordinate with others that use Transifex, but still
I'd hate it if we became dependent on Tx.


For better or worse we are dependent on TX (Transifex). Fedora has
adopted TX as its translation tool, RHEL's translation tools integrate
with TX (as well as other translation portals). And SSSD and IPA have
made a commitment to TX based on the direction of Fedora and RHEL.

Given that we've adopted TX I don't see the value in maintaining tools
that support both TX and non-TX workflows. I'd rather see us delete the
non-TX elements. If we have just one workflow it's easier to understand
and maintain the code. If we ever decide we need to go back to a non-TX
workflow we can always retrieve the deleted code from git.



This means you have to be a member of a Fedora translation team to
translate.


Actually we're not using the Fedora TX instance, rather the 
transifex.net instance so I don't think we're limited to translators who 
are members of a Fedora translation team.


It makes it harder for people to fork the project. A workflow
with a mandatory central repository makes it impossible to experiment
locally.
I'm all for having a standard way to receive contributions, but limiting
how people can create those contributions isn't good.

I'm all for deleting unused code, but here I think it would be a bad move.



Actually I don't have strong feelings about this one way or the other. 
My primary concern with two different workflows was that we have to test 
and maintain both and one of them is currently unused. My other concern 
is the added complexity: most developers and release engineers don't
understand this stuff, so keeping it simple to accommodate those less
familiar with the process seemed like a win.


But you have a valid point about being flexible, so it's fine with me to 
keep the old code. We probably need better documentation.


John



--
John Dennis jden...@redhat.com

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-25 Thread John Dennis

On 07/24/2012 04:17 AM, Petr Viktorin wrote:

On 07/23/2012 10:46 PM, John Dennis wrote:
[...]


The only thing holding up the ACK is the question of why po-files now
has update_pot as a dependency.



If files simply depend on $(DOMAIN).pot, then they are considered
up-to-date even after they're changed (e.g. with strip-po). They need to
depend on a rule that always runs so that they get merged.

There's another alternative to achieve this: adding them to .PHONY. The
attached version does that, perhaps it's cleaner.




ACK, thanks for the good work.

--
John Dennis jden...@redhat.com



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-25 Thread Rob Crittenden

John Dennis wrote:

On 07/24/2012 04:17 AM, Petr Viktorin wrote:

On 07/23/2012 10:46 PM, John Dennis wrote:
[...]


The only thing holding up the ACK is the question of why po-files now
has update_pot as a dependency.



If files simply depend on $(DOMAIN).pot, then they are considered
up-to-date even after they're changed (e.g. with strip-po). They need to
depend on a rule that always runs so that they get merged.

There's another alternative to achieve this: adding them to .PHONY. The
attached version does that, perhaps it's cleaner.




ACK, thanks for the good work.



pushed to master



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-24 Thread Petr Viktorin

On 07/23/2012 10:46 PM, John Dennis wrote:
[...]


The only thing holding up the ACK is the question of why po-files now
has update_pot as a dependency.



If files simply depend on $(DOMAIN).pot, then they are considered 
up-to-date even after they're changed (e.g. with strip-po). They need to 
depend on a rule that always runs so that they get merged.


There's another alternative to achieve this: adding them to .PHONY. The 
attached version does that, perhaps it's cleaner.
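
A rough model of the up-to-date check described above, offered as a sketch
rather than make's actual implementation: a file target is rebuilt only when
it is missing or older than one of its prerequisites, while a phony target is
always rerun. The function name `needs_rebuild` and the timestamps are made
up for illustration.

```python
def needs_rebuild(target_mtime, prereq_mtimes, phony=False):
    """Return True if a make-like tool would rerun the target's recipe."""
    if phony:
        # Phony targets are never considered up to date.
        return True
    if target_mtime is None:
        # The target file does not exist yet.
        return True
    # Otherwise rebuild only if some prerequisite is newer than the target.
    return any(mtime > target_mtime for mtime in prereq_mtimes)


POT_MTIME = 100          # hypothetical: when the .pot was last regenerated
STRIPPED_PO_MTIME = 300  # hypothetical: a .po file just rewritten by strip-po

plain = needs_rebuild(STRIPPED_PO_MTIME, [POT_MTIME])
as_phony = needs_rebuild(STRIPPED_PO_MTIME, [POT_MTIME], phony=True)
print(plain, as_phony)  # False True
```

With only $(DOMAIN).pot as a prerequisite, a .po file that strip-po has just
rewritten looks newer than the pot, so the merge is skipped. Marking the .po
files as phony forces the merge to run every time it is requested.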



--
Petr³


From 87d94d673a7647ffe508a11c985e76f575180971 Mon Sep 17 00:00:00 2001
From: Petr Viktorin pvikt...@redhat.com
Date: Wed, 20 Jun 2012 06:38:16 -0400
Subject: [PATCH] Arrange stripping .po files

The .po files we use for translations have two shortcomings when used in Git:
- They include file locations, which change each time the source is updated.
  This results in large, unreadable diffs that don't merge well.
- They include source strings for untranslated messages, wasting space
  unnecessarily.

Update the Makefile so that the extraneous information is stripped when the
files are updated or pulled from Transifex, and empty translation files are
removed entirely.
Also, translations are normalized to a common style. This should help diffs
and merges.

The validator requires file location comments to identify the programming
language, and to produce good error reports.
To make this work, merge the comments in before validation.

First patch for: https://fedorahosted.org/freeipa/ticket/2435
---
 install/configure.ac   |5 +
 install/po/Makefile.in |   22 --
 install/po/README  |   16 ++--
 tests/i18n.py  |   12 ++--
 4 files changed, 49 insertions(+), 6 deletions(-)

diff --git a/install/configure.ac b/install/configure.ac
index 827ddbab411a4aa8abbdd4488e217ce67046bd6b..9e781a684429191b3c5eb46aed4fceecc9be6586 100644
--- a/install/configure.ac
+++ b/install/configure.ac
@@ -48,6 +48,11 @@ if test x$MSGCMP = xno; then
 AC_MSG_ERROR([msgcmp not found, install gettext])
 fi
 
+AC_PATH_PROG(MSGATTRIB, msgattrib, [no])
+if test x$MSGATTRIB = xno; then
+AC_MSG_ERROR([msgattrib not found, install gettext])
+fi
+
 AC_PATH_PROG(TX, tx, [/usr/bin/tx])
 
 AC_ARG_WITH([gettext_domain],
diff --git a/install/po/Makefile.in b/install/po/Makefile.in
index 9a3dde78a20a6beb35ab08230331f28b7ea3161d..bc91a933b9e10e4178cb4190e62140549da06591 100644
--- a/install/po/Makefile.in
+++ b/install/po/Makefile.in
@@ -14,6 +14,7 @@ MSGFMT = @MSGFMT@
 MSGINIT = @MSGINIT@
 MSGMERGE = @MSGMERGE@
 MSGCMP = @MSGCMP@
+MSGATTRIB = @MSGATTRIB@
 TX = @TX@
 IPA_TEST_I18N = ../../tests/i18n.py
 
@@ -67,7 +68,7 @@ C_POTFILES = $(C_FILES) $(H_FILES)
 
 .SUFFIXES:
 .SUFFIXES: .po .mo
-.PHONY: all create-po update-po update-pot install mostlyclean clean distclean test mo-files debug
+.PHONY: all create-po update-po update-pot install mostlyclean clean distclean test mo-files debug strip-po merge-po $(po_files)
 
 all:
 
@@ -86,6 +87,19 @@ $(po_files): $(DOMAIN).pot
 	echo Merging $(DOMAIN).pot into $@; \
 	$(MSGMERGE) --no-fuzzy-matching -o $@ $@ $(DOMAIN).pot
 
+strip-po:
+	@for po_file in $(po_files); do \
+		echo Stripping $$po_file; \
+		$(MSGATTRIB) --translated --no-fuzzy --no-location $$po_file > $$po_file.tmp; \
+		mv $$po_file.tmp $$po_file; \
+	done
+	@export FILES_TO_REMOVE=`find . -name '*.po' -empty`; \
+	if [ "$$FILES_TO_REMOVE" != "" ]; then \
+		echo Removing empty translation files; \
+		rm -v $$FILES_TO_REMOVE; \
+		echo; echo Please remove the deleted files from LINGUAS!; echo; \
+	fi
+
 create-po: $(DOMAIN).pot
 	@for po_file in $(po_files); do \
 	if [ ! -e $$po_file ]; then \
@@ -98,10 +112,14 @@ create-po: $(DOMAIN).pot
 
 pull-po:
 	cd ../..; $(TX) pull -f
+	$(MAKE) strip-po
 
-update-po: update-pot
+merge-po: update-pot
 	$(MAKE) $(po_files)
 
+update-po: merge-po
+	$(MAKE) strip-po
+
 update-pot:
 	@rm -f $(DOMAIN).pot.update
 	@pushd ../.. ; \
diff --git a/install/po/README b/install/po/README
index ada7df40e3f294b204a5d44c267ee57ebe734042..6894a06337fac68675cb1a852ca828c54da74f96 100644
--- a/install/po/README
+++ b/install/po/README
@@ -6,28 +6,40 @@ A: Edit Makefile.in and add the source file to the appropriate *_POTFILES list.
NOTE: Now this i only necessary for python files that lack the .py
  extension. All .py, .c and .h files are automatically sourced.
 
+Q: Untranslated strings and file locations are missing from my .po file.
+   How do I add them?
+
+A: make merge-po
+   Untranslated strings are left out of the files in SCM. The merge-po command
+   runs msgmerge to add them again.
+
 Q: How do I pick up new strings to translate from the source files after the
source have been modified?
 
-A: make update-po
+A: make merge-po
This regenerates the pot template file by scanning all the source files.
Then the new strings are merged into each .po file from the new pot file.
 
 Q: How do I just regenerate the pot template file without 

Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-24 Thread Petr Viktorin

On 07/24/2012 01:12 AM, John Dennis wrote:

On 07/23/2012 06:27 AM, Petr Viktorin wrote:

As a translator (for another project), I don't like Transifex and prefer
to send good old Git pull requests. I understand a traditional
workflow is hard to coordinate with others that use Transifex, but still
I'd hate it if we became dependent on Tx.


For better or worse we are dependent on TX (Transifex). Fedora has
adopted TX as its translation tool, RHEL's translation tools integrate
with TX (as well as other translation portals). And SSSD and IPA have
made a commitment to TX based on the direction of Fedora and RHEL.

Given that we've adopted TX I don't see the value in maintaining tools
that support both TX and non-TX workflows. I'd rather see us delete the
non-TX elements. If we have just one workflow it's easier to understand
and maintain the code. If we ever decide we need to go back to a non-TX
workflow we can always retrieve the deleted code from git.



This means you have to be a member of a Fedora translation team to 
translate. It makes it harder for people to fork the project. A workflow 
with a mandatory central repository makes it impossible to experiment 
locally.
I'm all for having a standard way to receive contributions, but limiting 
how people can create those contributions isn't good.


I'm all for deleting unused code, but here I think it would be a bad move.


--
Petr³



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-23 Thread Petr Viktorin

On 07/20/2012 07:14 PM, John Dennis wrote:

On 07/20/2012 12:28 PM, Petr Viktorin wrote:

On 07/20/2012 05:39 PM, John Dennis wrote:

Great, I agree with everything you said.

I'm happy to have the file list be derived from the directory contents.
Are you planning on doing that in another patch?


Yes, I want to do it in a new patch.
It's a bit more complicated than it looks: creating a new translation
will work differently than just adding it to LINGUAS and running
create-po. The ticket is for beta 2 so I'd rather not start a new round
of reviews.


Fine with me to do that in another patch.

As for create-po, I think that's also a holdover from pre-Transifex days.
With Transifex I don't ever see a need to create an empty po file. Do
you? Maybe we should just nuke the po creation in the Makefile.


As a translator (for another project), I don't like Transifex and prefer 
to send good old Git pull requests. I understand a traditional 
workflow is hard to coordinate with others that use Transifex, but still 
I'd hate it if we became dependent on Tx.


[...]

But... I do have one final issue/question. I missed this in the first
review. po_files is now dependent on update-pot instead of the pot file.
We had decided that we were only going to regenerate the pot file on
demand at specific times. Won't this dependency change cause the pot
file to be updated frequently? (I realize only in the local tree). Note
that when we run the validations we generate a temporary pot file from
the current contents of the tree specifically to avoid overwriting the
pot file.


Are the po files updated more often? I don't really see a reason to
merge the po files with an old pot.


What merge are you referring to? The only merge I'm aware of at the
moment is during validation, but that merge is done from a temporary
updated pot file that is current with the tree.


I'm referring to a manual merge-po.
po_files are only rebuilt from merge-po, which merges the po files with 
the pot and adds all the missing translations and line numbers. This is 
not needed with Transifex workflow, as Tx should do this internally when 
a pot is pushed to it.



Any other merging is done by Transifex at the time we pull a po file.

The frequency of po update doesn't seem relevant, what is your concern
in this regard?


Is there another cause for the po_files to get rebuilt?


I suppose having a conversation about when the pot file gets updated is
a good one to have, we don't do it often enough IMHO. But I'm not sure
it's correct to modify a file under SCM control if it wasn't
intentional.


How is Transifex set up here? If it automatically picks up changes when
the pot file is modified, then we should back up the translations before
changing the pot, so we can't do it automatically.
Another wart is the line number cruft in the pot file -- any time it's
updated we'll get a huge diff, so it makes sense to update sparingly.


Transifex gives you two options for your pot file, either you tell TX
the location of your pot file in a public SCM and it watches for updates
and automatically pulls it when it changes in SCM -or- you manually push
the pot file to TX.

We've been using the "watch the pot file in git" option. Thus whenever
we commit a new version of the pot file all developers and TX get it
simultaneously (well sort of).

If we do the manual push method the maintainer has to *both* commit to
git *and* push to TX, so the former seems less error prone and more
automated.


Well, if the pot file is not in the repo, the maintainer only has to 
push it to Tx (after building it of course, but that needs to be done 
anyway).



The idea was we would have a string freeze prior to release and/or
periodic intervals during branch development to update the pot. But we
haven't been good about hitting these. However, note a manual push
suffers from the same "somebody has to do it at the right moment" problem.


Is this idea documented anywhere? It's hard to do a string freeze if 
it's not enforced automatically, let alone if people don't know there 
should be one.



If Transifex is not wired to the pot, we could even go as far as
removing it from SCM entirely -- it's entirely generated, and rebuilding
it takes less than a second.
We'd just have to update Transifex manually.


It currently is wired to the pot. You make a valid point about currently
not needing to maintain the pot in SCM. When we first set up
translations we weren't using TX so having the pot file in SCM was a
necessity.

Personally I don't trust TX's data storage and I think there is value in
having each pot we push to TX be recoverable from our SCM. When things
blow up (and they do) it's really nice to be able to reassemble the
pieces or at least follow the trail of how things changed. In the past
I've had to answer questions like "How the heck did this string get into
this po file?" Such questions can only be answered if we have the pot
file we gave the translator. TX doesn't maintain it so we have 

Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-23 Thread John Dennis

On 07/23/2012 06:27 AM, Petr Viktorin wrote:

On 07/20/2012 07:14 PM, John Dennis wrote:

On 07/20/2012 12:28 PM, Petr Viktorin wrote:

On 07/20/2012 05:39 PM, John Dennis wrote:

Great, I agree with everything you said.

I'm happy to have the file list be derived from the directory contents.
Are you planning on doing that in another patch?


Yes, I want to do it in a new patch.
It's a bit more complicated than it looks: creating a new translation
will work differently than just adding it to LINGUAS and running
create-po. The ticket is for beta 2 so I'd rather not start a new round
of reviews.


Fine with me to do that in another patch.

As for create-po, I think that's also a holdover from pre-Transifex days.
With Transifex I don't ever see a need to create an empty po file. Do
you? Maybe we should just nuke the po creation in the Makefile.


As a translator (for another project), I don't like Transifex and prefer
to send good old Git pull requests. I understand a traditional
workflow is hard to coordinate with others that use Transifex, but still
I'd hate it if we became dependent on Tx.

[...]

But... I do have one final issue/question. I missed this in the first
review. po_files is now dependent on update-pot instead of the pot file.
We had decided that we were only going to regenerate the pot file on
demand at specific times. Won't this dependency change cause the pot
file to be updated frequently? (I realize only in the local tree). Note
that when we run the validations we generate a temporary pot file from
the current contents of the tree specifically to avoid overwriting the
pot file.


Are the po files updated more often? I don't really see a reason to
merge the po files with an old pot.


What merge are you referring to? The only merge I'm aware of at the
moment is during validation, but that merge is done from a temporary
updated pot file that is current with the tree.


I'm referring to a manual merge-po.
po_files are only rebuilt from merge-po, which merges the po files with
the pot and adds all the missing translations and line numbers. This is
not needed with Transifex workflow, as Tx should do this internally when
a pot is pushed to it.


Any other merging is done by Transifex at the time we pull a po file.

The frequency of po update doesn't seem relevant, what is your concern
in this regard?


Is there another cause for the po_files to get rebuilt?


Using the TX model there is never a reason to build po files. We just 
pull them from TX.





I suppose having a conversation about when the pot file gets updated is
a good one to have, we don't do it often enough IMHO. But I'm not sure
it's correct to modify a file under SCM control if it wasn't
intentional.


How is Transifex set up here? If it automatically picks up changes when
the pot file is modified, then we should back up the translations before
changing the pot, so we can't do it automatically.
Another wart is the line number cruft in the pot file -- any time it's
updated we'll get a huge diff, so it makes sense to update sparingly.


Transifex gives you two options for your pot file, either you tell TX
the location of your pot file in a public SCM and it watches for updates
and automatically pulls it when it changes in SCM -or- you manually push
the pot file to TX.

We've been using the "watch the pot file in git" option. Thus whenever
we commit a new version of the pot file all developers and TX get it
simultaneously (well sort of).

If we do the manual push method the maintainer has to *both* commit to
git *and* push to TX, so the former seems less error prone and more
automated.


Well, if the pot file is not in the repo, the maintainer only has to
push it to Tx (after building it of course, but that needs to be done
anyway).


The idea was we would have a string freeze prior to release and/or
periodic intervals during branch development to update the pot. But we
haven't been good about hitting these. However, note a manual push
suffers from the same "somebody has to do it at the right moment" problem.


Is this idea documented anywhere? It's hard to do a string freeze if
it's not enforced automatically, let alone if people don't know there
should be one.


It was discussed in the developer conference calls.




If Transifex is not wired to the pot, we could even go as far as
removing it from SCM entirely -- it's entirely generated, and rebuilding
it takes less than a second.
We'd just have to update Transifex manually.


It currently is wired to the pot. You make a valid point about currently
not needing to maintain the pot in SCM. When we first set up
translations we weren't using TX so having the pot file in SCM was a
necessity.

Personally I don't trust TX's data storage and I think there is value in
having each pot we push to TX be recoverable from our SCM. When things
blow up (and they do) it's really nice to be able to reassemble the
pieces or at least follow the trail of how things changed. In the past
I've had to answer 

Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-23 Thread John Dennis

On 07/23/2012 06:27 AM, Petr Viktorin wrote:

As a translator (for another project), I don't like Transifex and prefer
to send good old Git pull requests. I understand a traditional
workflow is hard to coordinate with others that use Transifex, but still
I'd hate it if we became dependent on Tx.


For better or worse we are dependent on TX (Transifex). Fedora has
adopted TX as its translation tool, RHEL's translation tools integrate
with TX (as well as other translation portals). And SSSD and IPA have
made a commitment to TX based on the direction of Fedora and RHEL.


Given that we've adopted TX I don't see the value in maintaining tools 
that support both TX and non-TX workflows. I'd rather see us delete the 
non-TX elements. If we have just one workflow it's easier to understand 
and maintain the code. If we ever decide we need to go back to a non-TX 
workflow we can always retrieve the deleted code from git.


--
John Dennis jden...@redhat.com



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-20 Thread Petr Viktorin

On 07/19/2012 10:52 PM, John Dennis wrote:

On 06/25/2012 07:17 AM, Petr Viktorin wrote:

The translation files we currently store in Git are full of redundant
information: source strings for untranslated messages, and file
locations.
The first causes unnecessarily huge files. The second makes diffs
unreadable: when code is edited and line numbers change, metadata for
all messages shows up as changed. This makes reviewing translation
patches, and merging possible conflicts, hard -- it requires specialized
tools.

This patch changes the Makefile to strip the unneeded data from .po
files.

Translators using Git must now run msgmerge (or, `make merge-po`) to get
.po files they can work with. Transifex users are unaffected, as the
source .pot file is not changed.
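
To make the effect of stripping concrete, here is a simplified sketch of what
the stripping step does to a .po file. It is not the Makefile's
implementation (that relies on msgattrib); it only approximates the result on
simple input. The helper names and the sample file locations are invented for
illustration, and the sketch ignores obsolete entries and any special
treatment of the header entry.

```python
import re


def _has_translation(lines):
    """Return True if any msgstr (or msgstr[N]) in the entry is non-empty."""
    in_msgstr = False
    for line in lines:
        if line.startswith("msgstr"):
            in_msgstr = True
            # Keep only the quoted part after the keyword.
            quoted = line[line.index('"'):] if '"' in line else '""'
        elif in_msgstr and line.startswith('"'):
            # Continuation line of a multi-line msgstr.
            quoted = line
        else:
            in_msgstr = False
            continue
        if quoted.strip() != '""':
            return True
    return False


def strip_po(text):
    """Drop location comments, fuzzy entries and untranslated entries."""
    kept = []
    # Entries in a .po file are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        # "#:" lines carry the source file locations.
        lines = [l for l in block.splitlines() if not l.startswith("#:")]
        if any(l.startswith("#,") and "fuzzy" in l for l in lines):
            continue
        if not _has_translation(lines):
            continue
        kept.append("\n".join(lines))
    return "\n\n".join(kept) + "\n" if kept else ""


# Hypothetical sample; the file locations are made up.
SAMPLE = '''#: ipalib/cli.py:42
msgid "Hello"
msgstr "Ahoj"

#: ipalib/cli.py:57
msgid "Goodbye"
msgstr ""

#: ipalib/cli.py:60
#, fuzzy
msgid "Maybe"
msgstr "Mozna"
'''

print(strip_po(SAMPLE), end="")
```

Only the translated, non-fuzzy entry survives, and it no longer carries a
location comment, so later line-number changes in the source do not touch it.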

The i18n tests use file locations for producing nice error reports¹.
To make this work as before, the .pot is merged in before validation to
restore comments.
Currently this takes a noticeable amount of time, because polib uses a
particularly naïve algorithm for merging. I've sent a patch to polib to
resolve this; once that makes it downstream merging will be fast again.

Updating the translations with the new Makefile will cause a 5MB patch.
I don't want to pollute the mailing list with it, at least until the
Makefile patch is reviewed. It's available at
https://github.com/encukou/freeipa/commit/65e2e4.patch


https://fedorahosted.org/freeipa/ticket/2435


--
¹ And for divining the programming language messages come from, but that
is only done on the .pot file, unaffected by this patch.


Good work and it's very close to getting an ACK.

There is now a discrepancy between what the Makefile thinks is the list
of po files and the actual list of po files after running strip-po. This
causes confusing errors.

I think the source of this problem is the Makefile has a list of po
files in the variable $(po_files)

For starters why is:

strip-po:
 @for po_file in $$(ls *.po); do \

instead of:

strip-po:
 @for po_file in $(po_files); do \


Good catch, I'll update it to be consistent with the status quo.
But see below.


If you run make validate-po before running make strip-po you get:

5 errors in 21 files

After stripping the po files make validate-po gives you:

14 errors in 21 files


I left updating the files to a subsequent patch 
(https://github.com/encukou/freeipa/commit/65e2e4.patch); the LINGUAS 
update is part of that.



The extra 9 errors are due to the fact validate-po is being asked to
validate a non-existent po file which it considers an error (which I
believe is a correct check).

make msg-stats gets confused for the same reason, it's asked to
examine files that no longer exist.

make mo-files now fails catastrophically for the same reason, it's
being asked to operate on files that don't exist.

In general large parts of the Makefile will now be confused or generate
errors because the file list is incorrect.


Somehow we have to align the list of po files. That presents all sorts
of interesting questions:

* does the list come from the LINGUAS file? (current method)

* does the list come from git? Doesn't work if you're not in a git
development tree. This problem is easily seen when the RPMs are built.
No file list can be generated because there is no git repo so you end up
with 0 files being passed to the validation commands. Since validation
is not critical when building RPMs this hasn't been a show stopper but
it really needs to be fixed in some way at some point.


I agree that tying ourselves to Git isn't a nice thing to do. I know I 
am never happy when I can't compile some project in Mercurial after 
importing it to Git :)


If we use the ls-files strategy then that should at least write the list 
to a version-controlled file, which we fall back to in case we're not in 
a git tree.



* does the list come from the current directory contents? What you did
with strip-po, but that also has a potential for errors. What if someone
deletes or adds a file in their tree by mistake?


I personally would do this -- the most straightforward way to do it. If 
someone adds or deletes a file by mistake, a `git status` will reveal 
it. We could have a sanity check that refuses to build if there is a 
discrepancy between Git and the working tree (of course outside of a Git 
repo it would just warn).


There's one more reason for going with directory contents: when you're 
pulling from Transifex or otherwise adding/removing the translation 
files, you have to carefully keep LINGUAS in sync with the tree, 
otherwise the tools can either blow up or do too little. Debugging that 
could be frustrating. Having the tools look in the directory itself, and 
only doing sanity checking at a point where everything should be in 
order, should make everything easier.
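
A sanity check of this kind could look roughly like the sketch below, which
compares the languages listed in LINGUAS with the .po files actually present
in a directory. The function name is invented, and it assumes the usual
LINGUAS layout of whitespace-separated language codes with `#` comments; it
is an illustration, not what the Makefile currently does.

```python
from pathlib import Path


def linguas_discrepancies(po_dir):
    """Return (listed_but_missing, present_but_unlisted) language codes."""
    po_dir = Path(po_dir)
    listed = set()
    linguas = po_dir / "LINGUAS"
    if linguas.exists():
        for line in linguas.read_text().splitlines():
            # Drop comments, then take every whitespace-separated code.
            line = line.split("#", 1)[0]
            listed.update(line.split())
    # Languages for which a translation file actually exists.
    present = {p.stem for p in po_dir.glob("*.po")}
    return sorted(listed - present), sorted(present - listed)
```

A build could warn, or refuse to continue, when either list is non-empty,
instead of failing later on a file that does not exist.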



* should make strip-po edit the LINGUAS file? (maybe the best
solution). Maybe when it detects an empty file and removes it it should
run a sed command to delete the line in LINGUAS?



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-20 Thread John Dennis

Great, I agree with everything you said.

I'm happy to have the file list be derived from the directory contents. 
Are you planning on doing that in another patch?


FWIW the LINGUAS file was a holdover from when we first set this up 
based exclusively on GNU gettext suggested examples. As things have 
evolved it no longer makes sense. Also the contributing translators file 
is now out of date and was from an earlier era when translators emailed 
.po files to us, so it was easy to maintain. Now that everything is TX 
based we should probably nuke that file or figure out some way to extract
the contributors from either TX or the po files. I'm not sure we're even 
giving credit to the translators anymore, but we should.


But... I do have one final issue/question. I missed this in the first 
review. po_files is now dependent on update-pot instead of the pot file. 
We had decided that we were only going to regenerate the pot file on 
demand at specific times. Won't this dependency change cause the pot 
file to be updated frequently? (I realize only in the local tree). Note 
that when we run the validations we generate a temporary pot file from 
the current contents of the tree specifically to avoid overwriting the 
pot file.


I suppose having a conversation about when the pot file gets updated is 
a good one to have, we don't do it often enough IMHO. But I'm not sure 
it's correct to modify a file under SCM control if it wasn't intentional.


--
John Dennis jden...@redhat.com



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-20 Thread John Dennis

On 07/20/2012 12:28 PM, Petr Viktorin wrote:

On 07/20/2012 05:39 PM, John Dennis wrote:

Great, I agree with everything you said.

I'm happy to have the file list be derived from the directory contents.
Are you planning on doing that in another patch?


Yes, I want to do it in a new patch.
It's a bit more complicated than it looks: creating a new translation
will work differently than just adding it to LINGUAS and running
create-po. The ticket is for beta 2 so I'd rather not start a new round
of reviews.


Fine with me to do that in another patch.

As for create-po, I think that's also a holdover from pre-Transifex days.
With Transifex I don't ever see a need to create an empty po file. Do
you? Maybe we should just nuke the po creation in the Makefile.





FWIW the LINGUAS file was a holdover from when we first set this up
based exclusively on GNU gettext suggested examples. As things have
evolved it no longer makes sense. Also the contributing translators file
is now out of date and was from an earlier era when translators emailed
.po files to us, so it was easy to maintain. Now that everything is TX
based we should probably nuke that file or figure out some way to extract
the contributors from either TX or the po files. I'm not sure we're even
giving credit to the translators anymore, but we should.


Noted; when the discussion's done I'll file a ticket.


But... I do have one final issue/question. I missed this in the first
review. po_files is now dependent on update-pot instead of the pot file.
We had decided that we were only going to regenerate the pot file on
demand at specific times. Won't this dependency change cause the pot
file to be updated frequently? (I realize only in the local tree). Note
that when we run the validations we generate a temporary pot file from
the current contents of the tree specifically to avoid overwriting the
pot file.


Are the po files updated more often? I don't really see a reason to
merge the po files with an old pot.


What merge are you referring to? The only merge I'm aware of at the 
moment is during validation, but that merge is done from a temporary 
updated pot file that is current with the tree.


Any other merging is done by Transifex at the time we pull a po file.

The frequency of po update doesn't seem relevant, what is your concern 
in this regard?





I suppose having a conversation about when the pot file gets updated is
a good one to have, we don't do it often enough IMHO. But I'm not sure
it's correct to modify a file under SCM control if it wasn't intentional.


How is Transifex set up here? If it automatically picks up changes when
the pot file is modified, then we should back up the translations before
changing the pot, so we can't do it automatically.
Another wart is the line number cruft in the pot file -- any time it's
updated we'll get a huge diff, so it makes sense to update sparingly.


Transifex gives you two options for your pot file, either you tell TX 
the location of your pot file in a public SCM and it watches for updates 
and automatically pulls it when it changes in SCM -or- you manually push 
the pot file to TX.


We've been using the "watch the pot file in git" option. Thus whenever 
we commit a new version of the pot file all developers and TX get it 
simultaneously (well sort of).


If we do the manual push method the maintainer has to *both* commit to 
git *and* push to TX, so the former seems less error prone and more 
automated.


The idea was we would have a string freeze prior to release and/or 
periodic intervals during branch development to update the pot. But we 
haven't been good about hitting these. However, note a manual push 
suffers from the same "somebody has to do it at the right moment" problem.




If Transifex is not wired to the pot, we could even go as far as
removing it from SCM entirely -- it's entirely generated, and rebuilding
it takes less than a second.
We'd just have to update Transifex manually.


It currently is wired to the pot. You make a valid point about currently 
not needing to maintain the pot in SCM. When we first set up 
translations we weren't using TX so having the pot file in SCM was a 
necessity.


Personally I don't trust TX's data storage and I think there is value in 
having each pot we push to TX be recoverable from our SCM. When things 
blow up (and they do) it's really nice to be able to reassemble the 
pieces or at least follow the trail of how things changed. In the past 
I've had to answer questions like "How the heck did this string get into 
this po file?" Such questions can only be answered if we have the pot 
file we gave the translator. TX doesn't maintain it so we have to (or at 
least I think there is value in it).


Perhaps you can read between the lines and detect I don't view TX as the 
epitome of stability and robustness. It's still young and they are still 
adding features and changing how it works (kinda like IPA :-)



Oh, one thing I'll ask about: the Makefile is 

Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-19 Thread John Dennis

On 06/25/2012 07:17 AM, Petr Viktorin wrote:

The translation files we currently store in Git are full of redundant
information: source strings for untranslated messages, and file locations.
The first causes unnecessarily huge files. The second makes diffs
unreadable: when code is edited and line numbers change, metadata for
all messages shows up as changed. This makes reviewing translation
patches, and merging possible conflicts, hard -- it requires specialized
tools.

This patch changes the Makefile to strip the unneeded data from .po files.

Translators using Git must now run msgmerge (or, `make merge-po`) to get
.po files they can work with. Transifex users are unaffected, as the
source .pot file is not changed.

The i18n tests use file locations for producing nice error reports¹.
To make this work as before, the .pot is merged in before validation to
restore comments.
Currently this takes a noticeable amount of time, because polib uses a
particularly naïve algorithm for merging. I've sent a patch to polib to
resolve this; once that makes it downstream merging will be fast again.

Updating the translations with the new Makefile will cause a 5MB patch.
I don't want to pollute the mailing list with it, at least until the
Makefile patch is reviewed. It's available at
https://github.com/encukou/freeipa/commit/65e2e4.patch


https://fedorahosted.org/freeipa/ticket/2435


--
¹ And for divining the programming language messages come from, but that
is only done on the .pot file, unaffected by this patch.


Good work and it's very close to getting an ACK.

There is now a discrepancy between what the Makefile thinks is the list 
of po files and the actual list of po files after running strip-po. This 
causes confusing errors.


I think the source of this problem is that the Makefile keeps its list of
po files in the variable $(po_files).


For starters why is:

strip-po:
	@for po_file in $$(ls *.po); do \

instead of:

strip-po:
	@for po_file in $(po_files); do \

If you run make validate-po before running make strip-po you get:

5 errors in 21 files

After stripping the po files make validate-po gives you:

14 errors in 21 files

The extra 9 errors are due to the fact that validate-po is being asked to 
validate non-existent po files, which it considers an error (which I 
believe is a correct check).


make msg-stats gets confused for the same reason: it's asked to 
examine files that no longer exist.


make mo-files now fails catastrophically for the same reason: it's 
being asked to operate on files that don't exist.


In general large parts of the Makefile will now be confused or generate 
errors because the file list is incorrect.
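
The mismatch described above can be made visible by comparing the
languages named in LINGUAS with the .po files actually present. The
following is only a minimal sketch and is not part of the patch: the
function name linguas_mismatch is hypothetical, and it assumes LINGUAS
holds whitespace-separated language codes with optional '#' comments.

```python
import os


def linguas_mismatch(po_dir, linguas_name="LINGUAS"):
    """Return (missing, unlisted) .po file names for po_dir.

    missing:  listed in LINGUAS but absent on disk
    unlisted: present on disk but not listed in LINGUAS
    """
    listed = set()
    with open(os.path.join(po_dir, linguas_name), encoding="utf-8") as f:
        for line in f:
            # Drop comments, then accept whitespace-separated language codes.
            for lang in line.split("#", 1)[0].split():
                listed.add(lang + ".po")
    on_disk = {name for name in os.listdir(po_dir) if name.endswith(".po")}
    return sorted(listed - on_disk), sorted(on_disk - listed)
```

A non-empty "missing" list corresponds to the files that validate-po,
msg-stats and mo-files are asked to process even though strip-po has
removed them.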



Somehow we have to align the list of po files. That presents all sorts 
of interesting questions:


* does the list come from the LINGUAS file? (current method)

* does the list come from git? Doesn't work if you're not in a git 
development tree. This problem is easily seen when the RPMs are built. 
No file list can be generated because there is no git repo so you end up 
with 0 files being passed to the validation commands. Since validation 
is not critical when building RPMs this hasn't been a show stopper but 
it really needs to be fixed in some way at some point.


* does the list come from the current directory contents? That is what you 
did with strip-po, but it also has a potential for errors. What if someone 
deletes or adds a file in their tree by mistake?


* should make strip-po edit the LINGUAS file? (maybe the best 
solution). Maybe when it detects an empty file and removes it, it should 
run a sed command to delete the line in LINGUAS?


It may not be evident from Makefile.in but over the years there have been 
competing strategies for how to get our list of files. Simo added the 
git ls-files strategy because he didn't want to have an explicit list 
which had to be maintained (a valid concern). That still left us with the 
PY_EXPLICIT_FILES list, so how much did that really accomplish? Maybe 
PY_EXPLICIT_FILES can be removed in favor of a utility that tests the 
file type (or the hashbang interpreter line). But that still ties things 
to a git tree (ugh).
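
A utility of the kind suggested here could look roughly like the sketch
below. It is an illustration only: is_python_script is a hypothetical
name, and treating any '#!' line that mentions "python" as a Python
script is an assumption, not a rule taken from Makefile.in.

```python
def is_python_script(path):
    """Guess whether path is a Python script from its name or first line."""
    if path.endswith(".py"):
        return True
    try:
        with open(path, "rb") as f:
            first = f.readline(200)
    except OSError:
        # Unreadable or missing files (and directories) are not scripts.
        return False
    # Matches e.g. "#!/usr/bin/python" and "#!/usr/bin/env python3 -E".
    return first.startswith(b"#!") and b"python" in first
```

Such a check removes the need for a hand-maintained list, but it still
needs some source for the candidate paths, which is the part currently
supplied by git ls-files.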


If you have any great ideas on how to address the file list issue it 
would be good to hear them. However, in the interim we have to somehow 
adjust the po file list after strip-po runs. Once that's done I'm happy 
to ACK it.
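
One possible interim adjustment, sketched under assumptions: after
strip-po has removed empty files, drop the matching lines from LINGUAS.
The function name prune_linguas is hypothetical, and the sketch assumes
one language code per line with an optional trailing '#' comment.

```python
import os


def prune_linguas(po_dir, linguas_name="LINGUAS"):
    """Drop LINGUAS lines whose language has no .po file; return removed codes."""
    path = os.path.join(po_dir, linguas_name)
    with open(path, encoding="utf-8") as f:
        lines = f.readlines()

    kept, removed = [], []
    for line in lines:
        code = line.split("#", 1)[0].strip()
        # Keep blank and comment-only lines; test everything else against disk.
        if code and not os.path.exists(os.path.join(po_dir, code + ".po")):
            removed.append(code)
        else:
            kept.append(line)

    with open(path, "w", encoding="utf-8") as f:
        f.writelines(kept)
    return removed
```

The same effect could be had with the sed command mentioned earlier;
the point is only that the list and the directory are brought back in
line in the same step that removes the files.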


I wouldn't be surprised if you responded "Well, the file list 
discrepancy only occurs when a maintainer is explicitly stripping po 
files and they should know they have to adjust the LINGUAS file, 
therefore these confusing errors won't be seen by someone who would be 
confused by them." Maybe yes, maybe no. I can think of plenty of times I 
debugged some build/configure/make failure and groaned because it was in 
some area that was totally cryptic and unknown to me, took a long time 
to unravel and had a trivial adjustment fix that would have only been 
known to an expert in that part of the code.



Re: [Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-07-11 Thread Petr Viktorin

On 06/25/2012 01:17 PM, Petr Viktorin wrote:

The translation files we currently store in Git are full of redundant
information: source strings for untranslated messages, and file locations.
The first causes unnecessarily huge files. The second makes diffs
unreadable: when code is edited and line numbers change, metadata for
all messages shows up as changed. This makes reviewing translation
patches, and merging possible conflicts, hard -- it requires specialized
tools.

This patch changes the Makefile to strip the unneeded data from .po files.

Translators using Git must now run msgmerge (or, `make merge-po`) to get
.po files they can work with. Transifex users are unaffected, as the
source .pot file is not changed.

The i18n tests use file locations for producing nice error reports¹.
To make this work as before, the .pot is merged in before validation to
restore comments.
Currently this takes a noticeable amount of time, because polib uses a
particularly naïve algorithm for merging. I've sent a patch to polib to
resolve this; once that makes it downstream merging will be fast again.

Updating the translations with the new Makefile will cause a 5MB patch.
I don't want to pollute the mailing list with it, at least until the
Makefile patch is reviewed. It's available at
https://github.com/encukou/freeipa/commit/65e2e4.patch


https://fedorahosted.org/freeipa/ticket/2435



Could someone (John?) take some time to look at the patch?
I'll be away from office, returning on Tuesday 17th before the beta. It 
would be nice to have a review when I return.


--
Petr³

___
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel


[Freeipa-devel] [PATCH] 0066 Arrange stripping .po files

2012-06-25 Thread Petr Viktorin
The translation files we currently store in Git are full of redundant 
information: source strings for untranslated messages, and file locations.
The first causes unnecessarily huge files. The second makes diffs 
unreadable: when code is edited and line numbers change, metadata for 
all messages shows up as changed. This makes reviewing translation 
patches, and merging possible conflicts, hard -- it requires specialized 
tools.


This patch changes the Makefile to strip the unneeded data from .po files.

Translators using Git must now run msgmerge (or, `make merge-po`) to get 
.po files they can work with. Transifex users are unaffected, as the 
source .pot file is not changed.


The i18n tests use file locations for producing nice error reports¹.
To make this work as before, the .pot is merged in before validation to 
restore comments.
Currently this takes a noticeable amount of time, because polib uses a 
particularly naïve algorithm for merging. I've sent a patch to polib to 
resolve this; once that makes it downstream merging will be fast again.


Updating the translations with the new Makefile will cause a 5MB patch. 
I don't want to pollute the mailing list with it, at least until the 
Makefile patch is reviewed. It's available at
https://github.com/encukou/freeipa/commit/65e2e4.patch



https://fedorahosted.org/freeipa/ticket/2435


--
¹ And for divining the programming language messages come from, but that 
is only done on the .pot file, unaffected by this patch.


--
Petr³

From 16b20b737225908311f98e55db0938515e1abad6 Mon Sep 17 00:00:00 2001
From: Petr Viktorin pvikt...@redhat.com
Date: Wed, 20 Jun 2012 06:38:16 -0400
Subject: [PATCH] Arrange stripping .po files

The .po files we use for translations have two shortcomings when used in Git:
- They include file locations, which change each time the source is updated.
  This results in large, unreadable diffs that don't merge well.
- They include source strings for untranslated messages, wasting space
  unnecessarily.

Update the Makefile so that the extraneous information is stripped when the
files are updated or pulled from Transifex, and empty translation files are
removed entirely.
Also, translations are normalized to a common style. This should help diffs
and merges.

The validator requires file location comments to identify the programming
language, and to produce good error reports.
To make this work, merge the comments in before validation.

First patch for: https://fedorahosted.org/freeipa/ticket/2435
---
 install/configure.ac   |5 +
 install/po/Makefile.in |   20 +---
 install/po/README  |   16 ++--
 tests/i18n.py  |   12 ++--
 4 files changed, 46 insertions(+), 7 deletions(-)

diff --git a/install/configure.ac b/install/configure.ac
index 827ddbab411a4aa8abbdd4488e217ce67046bd6b..9e781a684429191b3c5eb46aed4fceecc9be6586 100644
--- a/install/configure.ac
+++ b/install/configure.ac
@@ -48,6 +48,11 @@ if test x$MSGCMP = xno; then
 AC_MSG_ERROR([msgcmp not found, install gettext])
 fi
 
+AC_PATH_PROG(MSGATTRIB, msgattrib, [no])
+if test x$MSGATTRIB = xno; then
+AC_MSG_ERROR([msgattrib not found, install gettext])
+fi
+
 AC_PATH_PROG(TX, tx, [/usr/bin/tx])
 
 AC_ARG_WITH([gettext_domain],
diff --git a/install/po/Makefile.in b/install/po/Makefile.in
index 9a3dde78a20a6beb35ab08230331f28b7ea3161d..c1a9bc8b8962fa2f9c7ff2bf541f5996e34a642f 100644
--- a/install/po/Makefile.in
+++ b/install/po/Makefile.in
@@ -14,6 +14,7 @@ MSGFMT = @MSGFMT@
 MSGINIT = @MSGINIT@
 MSGMERGE = @MSGMERGE@
 MSGCMP = @MSGCMP@
+MSGATTRIB = @MSGATTRIB@
 TX = @TX@
 IPA_TEST_I18N = ../../tests/i18n.py
 
@@ -67,25 +68,34 @@ C_POTFILES = $(C_FILES) $(H_FILES)
 
 .SUFFIXES:
 .SUFFIXES: .po .mo
-.PHONY: all create-po update-po update-pot install mostlyclean clean distclean test mo-files debug
+.PHONY: all create-po update-po update-pot install mostlyclean clean distclean test mo-files debug strip-po merge-po
 
 all:
 
 SUFFIXES = .po .mo
 
 .po.mo:
 	@echo Creating $@; \
 	$(MSGFMT) -c -o t-$@ $< && mv t-$@ $@
 
-$(po_files): $(DOMAIN).pot
+$(po_files): update-pot
 	@if [ ! -f $@ ]; then \
 	lang=`echo $@ | $(SED) -r -e 's/\.po$$//'` # Strip .po suffix ; \
 	echo Creating nonexistent $@, you should add this file to your SCM repository; \
 	$(MSGINIT) --locale $$lang --no-translator -i $(DOMAIN).pot -o $@; \
 	fi; \
 	echo Merging $(DOMAIN).pot into $@; \
 	$(MSGMERGE) --no-fuzzy-matching -o $@ $@ $(DOMAIN).pot
 
+strip-po:
+	@for po_file in $$(ls *.po); do \
+		echo Stripping $$po_file; \
+		$(MSGATTRIB) --translated --no-fuzzy --no-location $$po_file > $$po_file.tmp; \
+		mv $$po_file.tmp $$po_file; \
+	done
+	@echo Remove empty translation files; \
+	find . -name '*.po' -empty -exec rm -v {} \;
+
 create-po: $(DOMAIN).pot
 	@for po_file in $(po_files); do \
 	if [ ! -e $$po_file ]; then \
@@ -98,10 +108,14 @@ create-po: $(DOMAIN).pot
 
 pull-po:
 	cd ../..; $(TX) pull -f
+	$(MAKE) strip-po
 
-update-po: