Hi *,

On Mon, Nov 25, 2013 at 6:10 PM, Christian Lohmaier
<lohmaier+ooofut...@googlemail.com> wrote:
> On Sun, Nov 24, 2013 at 4:23 PM, Stanislav Horáček
> <stanislav.hora...@gmail.com> wrote:
>> [...]
>> The second issue was already mentioned by Andras: there is a huge amount of
>> strings in help where only meaningless (in terms of translation) identifiers
>> were changed. [...]
> Please focus on UI for now; Andras sent me the script he used
> previously to reduce this noise.

Andras' script would have had to be applied before doing the initial
update, so I had to take a detour and write a bigger script.

> I can try to merge back the strings from the 4.1 translations. But no 
> promises.

So what I did first was to compare the templates in 4.1 and 4.2 to get
a list of old IDs and new IDs, plus the affected files (to limit
processing time). This was done with a combination of a script and
manual review (as it's hard to map changes when there are completely
removed and added strings as well).
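
In miniature, the pairing step looks roughly like this. It is only a
sketch: pair_ids is a made-up helper name, the "Sample text" lines are
invented, and only the ID pair itself is taken from the real mapping.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of the real script): pair the IDs of one
# diff hunk by position. Returns old ID => new ID pairs, or an empty list
# when the counts differ and the hunk needs manual review.
sub pair_ids {
    my (@old, @new);
    for my $line (@_) {
        # "<" lines come from the 4.1 template, ">" lines from 4.2
        my ($direction, $id) = $line =~ /^([<>]).*ahelp hid=\\"([^\\]*)/
            or next;
        if ($direction eq '<') { push @old, $id } else { push @new, $id }
    }
    return if @old != @new;
    my %pairs;
    @pairs{@old} = @new;
    return %pairs;
}

# Made-up sample hunk in the shape "diff -r" prints for .pot files
my @hunk = (
    '< msgid "<ahelp hid=\"878711313\">Sample text</ahelp>"',
    '---',
    '> msgid "<ahelp hid=\"modules/swriter/ui/flddbpage/browse\">Sample text</ahelp>"',
);
my %pairs = pair_ids(@hunk);
print "$_ => $pairs{$_}\n" for sort keys %pairs;
```

Pairing by position only works when a hunk swaps IDs one for one, which
is why unbalanced hunks are handed back for manual review instead of
being guessed at.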

The output of that script was used as the basis for a second one that

* checks libo41x_help for a translation of the string
* applies that to libo_help if the translation for that string is empty.
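
The rule in the second bullet boils down to something like the
following sketch. pick_msgstr is a made-up name for illustration, and
the sample strings are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: decide which msgstr line ends up in the 4.2 file.
# $current is the msgstr line already in libo_help; $recovered is the
# ID-adjusted msgstr line found in the 4.1 project, or undef if none.
sub pick_msgstr {
    my ($current, $recovered) = @_;
    return $current unless defined $recovered;           # nothing to merge
    return $current unless $current =~ /^msgstr ""\s*$/; # keep existing work
    return $recovered;                                   # fill the empty entry
}

print pick_msgstr(qq{msgstr ""\n},    qq{msgstr "Durchsuchen"\n});
print pick_msgstr(qq{msgstr "Neu"\n}, qq{msgstr "Alt"\n});
```

Anything a translator has already entered in the new project always
wins; the old translation is only used to fill a gap.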

For future reference (as gmail's search is much better than my memory
:-))) - here's the hacky script that is now run. It should save you
about 12000 words.


use strict;
use warnings;
use utf8;

binmode STDIN, ':utf8';
binmode STDOUT, ':utf8';

my $language = shift;

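# update files on-disk
# (the trailing --language argument is an assumption, mirroring the
# update_against_templates call at the end of the script)
system("python src/manage.py sync_stores --project=libo41x_help --language=$language");
system("python src/manage.py sync_stores --project=libo_help --language=$language");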

my @files = qw(sbasic/shared/01.pot scalc/01.pot schart/01.pot
shared/01.pot shared/02.pot shared/autopi.pot
shared/explorer/database.pot shared/optionen.pot simpress/01.pot
smath/01.pot swriter/01.pot);

my %mapping = ();
$mapping{'53251'} = 'modules/swriter/ui/optcompatpage/default';
$mapping{'878711313'} = 'modules/swriter/ui/flddbpage/browse';
$mapping{'878871556'} =
$mapping{'878874627'} =
$mapping{'879350288'} = 'modules/swriter/ui/optcaptionpage/category';
# lots of other mappings stripped from this post

my %translations = ();
my $orig = "";
my $translation = "";
foreach my $file (@files) {
    # file.pot → file.po
    # (substitution reconstructed from the comment above; as foreach
    # aliases the list elements, @files holds .po names from here on)
    $file =~ s/\.pot$/.po/;
    print "Datei: $file\n";
    # get rid of those nasty multiline wraps that break grepping
    # (redirect target assumed to be the temp file that is grepped below)
    system("msgcat --no-wrap translations/libo41x_help/$language/$file > /tmp/cloph-fix-helpids.$language.tmp");
    while( my ($old, $new) = each %mapping) {
        #brute-force, just grep the same file ~900 times. Not nice, but works
        open(GREP, "grep $old /tmp/cloph-fix-helpids.$language.tmp | ")
            or die "Cannot read temporary input file ($!)\n";
        binmode GREP, ":utf8";
        while(<GREP>) {
            if (/^msgid/) {
                $orig = $_;
                # assumption: the key has to match the msgid as it appears
                # in the 4.2 file, so the ID is rewritten here as well
                $orig =~ s/$old/$new/;
                $translation = <GREP>;
                next unless $translation;
                if ($translation =~ m/^msgstr/) {
                    $translation =~ s/$old/$new/;
                    $translations{$orig} = $translation;
                } else {
                    # either there is no translation or it matches a commented-out entry
                    print "translation doesn't start with msgstr! ($translation)";
                }
            }
        }
        close GREP;
    }
    unlink "/tmp/cloph-fix-helpids.$language.tmp";
}

foreach my $file (@files) {
    print "Datei: $file\n";
    open(ORIG, "msgcat --no-wrap translations/libo_help/$language/$file |")
        or die "Cannot read input file $file ($!)\n";
    binmode ORIG, ":utf8";
    open(MOD, ">", "translations/libo_help/$language/$file.mod")
        or die "Cannot write output file $file.mod ($!)\n";
    binmode MOD, ":utf8";
    my $discard = "";
    while(<ORIG>) {
        if ($translations{$_}) {
            # msgid with a recovered translation: copy it, then look at its msgstr
            print MOD;
            $discard = <ORIG>;
            # only update if translation is empty
            if ($discard =~ m/^msgstr ""/) {
                print MOD $translations{$_};
            } else {
                print MOD $discard;
            }
        } else {
            print MOD;
        }
    }
    close ORIG;
    close MOD;
    # (rename target assumed: replace the original file with the merged one)
    rename "translations/libo_help/$language/$file.mod",
        "translations/libo_help/$language/$file";
}
# update database from file on-disk and rerun update_against_templates
# (just to be safe)
# (the --language argument on update_stores is an assumption)
system("python src/manage.py update_stores --project=libo_help --language=$language");
system("python src/manage.py update_against_templates --project=libo_help --language=$language");

And for the sake of completeness the script to collect the changed IDs:

use warnings;
use strict;
use utf8;
binmode STDIN, ':utf8';
binmode STDOUT, ':utf8';

my %replacements = ();

open(DIFF, "diff -r libo41x_help/ libo_help/ |") || die "Cannot run diff $!";

my @old = ();
my @new = ();
my $filename = "";
my %affectedfiles = ();
while(<DIFF>) {
    if (/^diff/) {
        # a new file starts: remember its name, then flush the IDs
        # collected for the previous file
        ($filename) = m§diff -r libo41x_help/templates/([^ ]*)§;

        if (@old) {
            if (@old != @new) {
                # catch unbalanced amount of replacements
                print"\n".@old." vs ".@new."\n";
                for my $i (0 .. $#old) {
                    print "$old[$i]\t$new[$i]\n";
                }
                exit 1;
            } else {
                @replacements{@old} = @new;
            }
        }
        @old = ();
        @new = ();
    } elsif (/ahelp hid/) {
        my ($direction, $id) = m/^(.).*ahelp hid=\\"([^\\]*)/;
        # ignore removed IDs
        next if $id eq "HID_FUNC_XOR";
        next if $id eq ".";
        $affectedfiles{$filename} = 1;
        if ($direction eq "<")  {
            push(@old, $id);
        } else {
            push(@new, $id);
        }
    }
}

# make sure to manually check entries
foreach my $key (sort (keys %replacements)) {
    if ($key ne $replacements{$key}) {
        print "\$mapping{'$key'}\t= '$replacements{$key}';\n";
    }
}
print "Number of replacements:\t".keys(%replacements)."\n";
print "Number of affected files:\t".keys(%affectedfiles)."\n";
print "Affected files:\t".join(" ",sort(keys(%affectedfiles)))."\n";


