On 04/18/2012 07:33 AM, Petr Viktorin wrote:
On 04/16/2012 10:32 PM, John Dennis wrote:
On 04/12/2012 09:26 AM, Petr Viktorin wrote:
On 03/30/2012 03:45 AM, John Dennis wrote:
Translatable strings have certain requirements for proper translation
and run time behaviour. We should routinely validate those strings. A
recent checkin to install/po/test_i18n.py makes it possible to validate
the contents of our current pot file by searching for problematic
strings.

However, we only occasionally generate a new pot file, far less
frequently than translatable strings change in the source base. Just
like other checkin's to the source which are tested for correctness we
should also validate new or modified translation strings when they are
introduced and not accumulate problems to fix at the last minute. This
would also raise the awareness of developers as to the requirements for
proper string translation.

The top level Makefile should be enhanced to create a temporary pot
files from the current sources and validate it. We need to use a
temporary pot file because we do not want to modify the pot file under
source code control and exported to Transifex.


NACK

install/po/Makefile is not created early enough when running `make rpms`
from a clean checkout.

# git clean -fx
...
# make rpms
rm -rf /rpmbuild
mkdir -p /rpmbuild/BUILD
mkdir -p /rpmbuild/RPMS
mkdir -p /rpmbuild/SOURCES
mkdir -p /rpmbuild/SPECS
mkdir -p /rpmbuild/SRPMS
mkdir -p dist/rpms
mkdir -p dist/srpms
if [ ! -e RELEASE ]; then echo 0>  RELEASE; fi
sed -e s/__VERSION__/2.99.0GITde16a82/ -e s/__RELEASE__/0/ \
freeipa.spec.in>  freeipa.spec
sed -e s/__VERSION__/2.99.0GITde16a82/ version.m4.in \
version.m4
sed -e s/__VERSION__/2.99.0GITde16a82/ ipapython/setup.py.in \
ipapython/setup.py
sed -e s/__VERSION__/2.99.0GITde16a82/ ipapython/version.py.in \
ipapython/version.py
perl -pi -e "s:__NUM_VERSION__:2990:" ipapython/version.py
perl -pi -e "s:__API_VERSION__:2.34:" ipapython/version.py
sed -e s/__VERSION__/2.99.0GITde16a82/ daemons/ipa-version.h.in \
daemons/ipa-version.h
perl -pi -e "s:__NUM_VERSION__:2990:" daemons/ipa-version.h
perl -pi -e "s:__DATA_VERSION__:20100614120000:" daemons/ipa-version.h
sed -e s/__VERSION__/2.99.0GITde16a82/ -e s/__RELEASE__/0/ \
ipa-client/ipa-client.spec.in>  ipa-client/ipa-client.spec
sed -e s/__VERSION__/2.99.0GITde16a82/ ipa-client/version.m4.in \
ipa-client/version.m4
if [ "redhat" != "" ]; then \
sed -e s/SUPPORTED_PLATFORM/redhat/ ipapython/services.py.in \
ipapython/services.py; \
fi
if [ "" != "yes" ]; then \
./makeapi --validate; \
fi
make -C install/po validate-src-strings
make[1]: Entering directory `/home/pviktori/freeipa/install/po'
make[1]: *** No rule to make target `validate-src-strings'. Stop.
make[1]: Leaving directory `/home/pviktori/freeipa/install/po'
make: *** [validate-src-strings] Error 2

Updated patch attached.

The fundamental problem is that we were trying to run code before
configure had ever been run. We have dependencies on the output and side
effects of configure. Therefore the solution is to add the
bootstrap-autogen target as a dependency.

FWIW, I tried an approach that did not require having bootstrap-autogen
run first by having the validation occur as part of the normal "make
all". But that had some undesirable properties, first we really only
want to run validation for developers, not during a normal build,
secondly the file generation requires a git repo (see below).

But there is another reason to require running bootstrap-autogen prior
to any validation (e.g. makeapi, make-lint, etc.) Those validation
utilities need access to generated source files and those generated
source files are supposed to be generated by the configure step. Right
now we're generating them as part of updating version information. I
will post a long email describing the problem on the devel list. So
we've created a situation we we must run configure early on, even as
part of "making the distribution" because part of "making the
distribution" is validating the distribution and that requires the full
content of the distribution be available.

Also while trying to determine if the i18n validation step executed
correctly I realized the i18n validation only emitted a message if an
error was detected. That made it impossible to distinguish between empty
input (a problem when not running in a development tree) and successful
validation. Therefore I also updated i18n.py to output counts of
messages checked which also caused me to fix some validations that were
missing on plural forms.



This still doesn't solve the problem: Make doesn't guarantee the order
of dependencies, so with the rule:
  >  lint: bootstrap-autogen validate-src-strings
  >          ./make-lint $(LINT_OPTIONS)
the validate-src-strings can fire before bootstrap-autogen:

Fixed by having bootstrap-autogen be a dependency of the lint target and adding validate-src-strings to the lint rules.


$ make lint
if [ ! -e RELEASE ]; then echo 0>  RELEASE; fi
/usr/bin/make -C install/po validate-src-strings
make[1]: Entering directory `/home/pviktori/freeipa/install/po'
make[1]: *** No rule to make target `validate-src-strings'.  Stop.
make[1]: Leaving directory `/home/pviktori/freeipa/install/po'
make: *** [validate-src-strings] Error 2
make: *** Waiting for unfinished jobs....

When I move the bootstrap-autogen dependency to validate-src-strings, it
works. (Quite a lot of needed for a lint now, but discussion about that
is elsewhere.)

I'm not sure if you're concern is with having to run bootstrap-autogen for lint or adding the string validation to the lint check.

As I tried to point out in my email irrespective of validating the i18n strings we should have run bootstrap-autogen prior to lint because that is what is supposed to create the generated Python file(s) that we're asking pylint to validate.

If the concern is with validating the i18n strings as part of lint then I'll make these observations:

1) validating the source strings is logically part of the lint check. Why? Because lint validates source files. The i18n string validation is also validating the contents of the same source files, it's just doing a job traditional lint can't perform, thus it's a logical extension of a lint type validation.

2) If the concern is with performing extra steps, performance, elapsed time to run the lint check etc. then

  a) On my laptop i18n validation takes 0.77s elapsed time

  b) On my laptop make-lint takes 1m17s or 77s elapsed time.

  c) Validating i18n strings is 100x faster than lint'ing the source
  code, or put another way adding i18n validation adds a mere 1% to
  the elapsed time needed to run lint.



Now that there are warnings, is pedantic mode necessary?

Great question, I also pondered that as well. My conclusion was there was value in separating aggressiveness of error checking from the verbosity of the output. Also I didn't think we wanted warnings showing in normal checking for things which are suspicious but not known to be incorrect. So under the current scheme pedantic mode enables reporting of suspicious constructs. You can still get a warning in the normal mode for things which aren't fatal but are provably incorrect. An example of this would be missing plural translations, it won't cause a run time failure and we can be definite about their absence, however they should be fixed, but it's not mandatory they be fixed, a warning in this case seems appropriate.

The validate_file function still contains some `return 1` lines, you
should update them all. Or just raise exceptions in these cases.

$ tests/i18n.py --validate-pot nonexisting_file
file does not exist "nonexisting_file"
Traceback (most recent call last):
    File "tests/i18n.py", line 817, in<module>
      sys.exit(main())
    File "tests/i18n.py", line 761, in main
      n_entries, n_msgids, n_msgstrs, n_warnings, n_errors =
validate_file(f, validation_mode)
TypeError: 'int' object is not iterable

Good catch, thank you! (see below)

I'd also return a namedtuple instead of a plain tuple, and have the
caller access the items by name. That way, if validate_file decides to
return more items in the future, we won't have to rewrite all calls to it.

Yeah, FWIW I thought about making the return values named tuple when I modified the code (are you a mind reader?). Not sure why I didn't, probably because I wanted to limit the engineering effort. Anyway, it's now a named tuple and the return values have been sanitized.

Revised patch attached.



--
John Dennis <jden...@redhat.com>

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/
>From 874e0dcc8f64eab135a31de2e6d0aced21a1b4ab Mon Sep 17 00:00:00 2001
From: John Dennis <jden...@redhat.com>
Date: Mon, 16 Apr 2012 15:51:42 -0400
Subject: [PATCH 70-2] validate i18n strings when running "make lint"
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit

* Add bootstrap-autogen depdenency to lint target to force
  generated files to be created.

* Add validate-src-strings to lint rules

* Add validate-src-strings as dependency to lint targett

* Remove obsolete test_lang frm test target

* Add diagnostic message to validation command in i18n.py
  that outputs how many objects were scanned. Formerly it only
  output a message if there were errors. This made it impossible to
  distinguish an empty file from one with no errors.

* While adding the validation counts it was discovered plurals had
  been omitted for some of the validation checks. Added the missing
  checks for plural forms.

* Also distinguished between errors and warnings. Permit warnings to
  be emitted but do not fail the validatition unless actual errors
  were also detected.
---
 Makefile               |    9 ++-
 install/po/Makefile.in |   10 +++-
 tests/i18n.py          |  151 +++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 139 insertions(+), 31 deletions(-)

diff --git a/Makefile b/Makefile
index 07abf6f..1025591 100644
--- a/Makefile
+++ b/Makefile
@@ -88,11 +88,12 @@ client-dirs:
 		echo "Without those directories ipa-client-install will fail" ; \
 	fi
 
-lint:
+lint: bootstrap-autogen
 	./make-lint $(LINT_OPTIONS)
+	$(MAKE) -C install/po validate-src-strings
+
 
 test:
-	$(MAKE) -C install/po test_lang
 	./make-testcert
 	./make-test
 
@@ -120,12 +121,12 @@ version-update: release-update
 		ipa-client/ipa-client.spec.in > ipa-client/ipa-client.spec
 	sed -e s/__VERSION__/$(IPA_VERSION)/ ipa-client/version.m4.in \
 		> ipa-client/version.m4
-	
+
 	if [ "$(SUPPORTED_PLATFORM)" != "" ]; then \
 		sed -e s/SUPPORTED_PLATFORM/$(SUPPORTED_PLATFORM)/ ipapython/services.py.in \
 			> ipapython/services.py; \
 	fi
-	
+
 	if [ "$(SKIP_API_VERSION_CHECK)" != "yes" ]; then \
 		./makeapi --validate; \
 	fi
diff --git a/install/po/Makefile.in b/install/po/Makefile.in
index 02561ed..37462bd 100644
--- a/install/po/Makefile.in
+++ b/install/po/Makefile.in
@@ -158,7 +158,7 @@ install: $(mo_files)
 	done
 
 mostlyclean:
-	rm -rf *.mo test.po test_locale
+	rm -rf *.mo test.po test_locale tmp.pot
 	rm -f $(DOMAIN).pot.update $(DOMAIN).pot.update.tmp $(DOMAIN).pot.tmp
 
 clean: mostlyclean
@@ -177,6 +177,14 @@ validate-pot:
 validate-po:
 	$(IPA_TEST_I18N) --show-strings --validate-po $(po_files)
 
+validate-src-strings:
+	@rm -f tmp.pot
+	@touch tmp.pot
+	@$(MAKE) DOMAIN=tmp update-pot; \
+	status=$$?; \
+	rm tmp.pot; \
+	exit $$status
+
 debug:
 	@echo Python potfiles:
 	@echo PY_FILES = $(PY_FILES)
diff --git a/tests/i18n.py b/tests/i18n.py
index 067bc5e..703dc8b 100755
--- a/tests/i18n.py
+++ b/tests/i18n.py
@@ -29,6 +29,7 @@ import re
 import os
 import traceback
 import polib
+from collections import namedtuple
 
 '''
 We test our translations by taking the original untranslated string
@@ -379,51 +380,138 @@ def validate_file(file_path, validation_mode):
     Returns the number of entries with errors.
     '''
 
+    def emit_messages():
+        if n_warnings:
+            warning_lines.insert(0, section_seperator)
+            warning_lines.insert(1, "%d validation warnings in %s" % (n_warnings, file_path))
+            print '\n'.join(warning_lines)
+
+        if n_errors:
+            error_lines.insert(0, section_seperator)
+            error_lines.insert(1, "%d validation errors in %s" % (n_errors, file_path))
+            print '\n'.join(error_lines)
+
+    Result = namedtuple('ValidateFileResult', ['n_entries', 'n_msgids', 'n_msgstrs', 'n_warnings', 'n_errors'])
+
+    warning_lines = []
     error_lines = []
-    n_entries_with_errors = 0
+    n_entries = 0
+    n_msgids = 0
+    n_msgstrs = 0
+    n_entries = 0
+    n_warnings = 0
+    n_errors  = 0
+    n_plural_forms = 0
 
     if not os.path.isfile(file_path):
-        print >>sys.stderr, 'file does not exist "%s"' % (file_path)
-        return 1
+        error_lines.append(entry_seperator)
+        error_lines.append('file does not exist "%s"' % (file_path))
+        n_errors += 1
+        emit_messages()
+        return Result(n_entries=n_entries, n_msgids=n_msgids, n_msgstrs=n_msgstrs, n_warnings=n_warnings, n_errors=n_errors)
+
     try:
         po = polib.pofile(file_path)
     except Exception, e:
-        print >>sys.stderr, 'Unable to parse file "%s": %s' % (file_path, e)
-        return 1
+        error_lines.append(entry_seperator)
+        error_lines.append('Unable to parse file "%s": %s' % (file_path, e))
+        n_errors += 1
+        emit_messages()
+        return Result(n_entries=n_entries, n_msgids=n_msgids, n_msgstrs=n_msgstrs, n_warnings=n_warnings, n_errors=n_errors)
 
+
+    if validation_mode == 'po':
+        plural_forms = po.metadata.get('Plural-Forms')
+        if not plural_forms:
+            error_lines.append(entry_seperator)
+            error_lines.append("%s: does not have Plural-Forms header" % file_path)
+            n_errors += 1
+        match = re.search(r'\bnplurals\s*=\s*(\d+)', plural_forms)
+        if match:
+            n_plural_forms = int(match.group(1))
+        else:
+            error_lines.append(entry_seperator)
+            error_lines.append("%s: does not specify integer nplurals in Plural-Forms header" % file_path)
+            n_errors += 1
+
+    n_entries = len(po)
     for entry in po:
+        entry_warnings = []
         entry_errors = []
-        msgid = entry.msgid
-        msgstr = entry.msgstr
-        have_msgid = msgid.strip() != ''
-        have_msgstr = msgstr.strip() != ''
+        have_msgid = entry.msgid.strip() != ''
+        have_msgid_plural = entry.msgid_plural.strip() != ''
+        have_msgstr = entry.msgstr.strip() != ''
+
+        if have_msgid:
+            n_msgids += 1
+        if have_msgid_plural:
+            n_msgids += 1
+        if have_msgstr:
+            n_msgstrs += 1
+
         if validation_mode == 'pot':
+            prog_langs = get_prog_langs(entry)
             if have_msgid:
-                prog_langs = get_prog_langs(entry)
-                errors = validate_positional_substitutions(msgid, prog_langs, 'msgid')
+                errors = validate_positional_substitutions(entry.msgid, prog_langs, 'msgid')
                 entry_errors.extend(errors)
-        if validation_mode == 'po':
-            if have_msgid and have_msgstr:
-                errors = validate_substitutions_match(msgid, msgstr, 'msgid', 'msgstr')
+            if have_msgid_plural:
+                errors = validate_positional_substitutions(entry.msgid_plural, prog_langs, 'msgid_plural')
                 entry_errors.extend(errors)
+        elif validation_mode == 'po':
+            if have_msgid:
+                if have_msgstr:
+                    errors = validate_substitutions_match(entry.msgid, entry.msgstr, 'msgid', 'msgstr')
+                    entry_errors.extend(errors)
+
+                if have_msgid_plural and have_msgstr:
+                    n_plurals = 0
+                    for index, msgstr in entry.msgstr_plural.items():
+                        have_msgstr_plural = msgstr.strip() != ''
+                        if have_msgstr_plural:
+                            n_plurals += 1
+                            errors = validate_substitutions_match(entry.msgid_plural, msgstr, 'msgid_plural', 'msgstr_plural[%s]' % index)
+                            entry_errors.extend(errors)
+                        else:
+                            entry_errors.append('msgstr_plural[%s] is empty' % (index))
+                    if n_plural_forms != n_plurals:
+                        entry_errors.append('%d plural forms specified, but this entry has %d plurals' % (n_plural_forms, n_plurals))
+
         if pedantic:
             if have_msgid:
-                errors = validate_substitution_syntax(msgid, 'msgid')
-                entry_errors.extend(errors)
+                errors = validate_substitution_syntax(entry.msgid, 'msgid')
+                entry_warnings.extend(errors)
+
+            if have_msgid_plural:
+                errors = validate_substitution_syntax(entry.msgid_plural, 'msgid_plural')
+                entry_warnings.extend(errors)
+
+                errors = validate_substitutions_match(entry.msgid, entry.msgid_plural, 'msgid', 'msgid_plural')
+                entry_warnings.extend(errors)
+
+                for index, msgstr in entry.msgstr_plural.items():
+                    have_msgstr_plural = msgstr.strip() != ''
+                    if have_msgstr_plural:
+                        errors = validate_substitution_syntax(msgstr,  'msgstr_plural[%s]' % index)
+                        entry_warnings.extend(errors)
+
             if have_msgstr:
-                errors = validate_substitution_syntax(msgstr, 'msgstr')
-                entry_errors.extend(errors)
+                errors = validate_substitution_syntax(entry.msgstr, 'msgstr')
+                entry_warnings.extend(errors)
+
+        if entry_warnings:
+            warning_lines.append(entry_seperator)
+            warning_lines.append('locations: %s' % (', '.join(["%s:%d" % (x[0], int(x[1])) for x in entry.occurrences])))
+            warning_lines.extend(entry_warnings)
+            n_warnings += 1
+
         if entry_errors:
             error_lines.append(entry_seperator)
             error_lines.append('locations: %s' % (', '.join(["%s:%d" % (x[0], int(x[1])) for x in entry.occurrences])))
             error_lines.extend(entry_errors)
-            n_entries_with_errors += 1
-    if n_entries_with_errors:
-        error_lines.insert(0, section_seperator)
-        error_lines.insert(1, "%d validation errors in %s" % (n_entries_with_errors, file_path))
-        print '\n'.join(error_lines)
+            n_errors += 1
 
-    return n_entries_with_errors
+    emit_messages()
+    return Result(n_entries=n_entries, n_msgids=n_msgids, n_msgstrs=n_msgstrs, n_warnings=n_warnings, n_errors=n_errors)
 
 
 #----------------------------------------------------------------------
@@ -676,10 +764,21 @@ def main():
             print >> sys.stderr, 'ERROR: unknown validation mode "%s"' % (options.mode)
             return 1
 
+        total_entries = 0
+        total_msgids = 0
+        total_msgstrs = 0
+        total_warnings = 0
         total_errors = 0
+
         for f in files:
-            n_errors = validate_file(f, validation_mode)
-            total_errors += n_errors
+            result = validate_file(f, validation_mode)
+            total_entries += result.n_entries
+            total_msgids += result.n_msgids
+            total_msgstrs += result.n_msgstrs
+            total_warnings += result.n_warnings
+            total_errors += result.n_errors
+            print "%s: %d entries, %d msgid, %d msgstr, %d warnings %d errors" % \
+                (f, result.n_entries, result.n_msgids, result.n_msgstrs, result.n_warnings, result.n_errors)
         if total_errors:
             print section_seperator
             print "%d errors in %d files" % (total_errors, len(files))
-- 
1.7.7.6

_______________________________________________
Freeipa-devel mailing list
Freeipa-devel@redhat.com
https://www.redhat.com/mailman/listinfo/freeipa-devel

Reply via email to