Re: [l10n-dev] IMPORTANT: OpenOffice.org 3.3 - translation schedule

Goran Rakic Wed, 23 Jun 2010 16:41:08 -0700

On Wed, 23 Jun 2010 at 10:04 +0200, F Wolff wrote:
> > 
> > Hmm .. I'm not really keen on having "yet another tool". If we are 
> > working on something, this should be done within translate toolkit.
> 
> The tool that Goran mentions, pomigrate2 is supposed to handle this to
> some extent.  Goran, can you maybe help us see what is currently missing
> in pomigrate2 to handle this situation?

Hi Friedel, all

Thank you for your comments. I do not know much about the Translate
Toolkit source, so please forgive me if I say something wrong here. What
I would like is to get your feedback on some ideas, which may lead to a
better workflow or better tools. I should say that I am very happy to be
able to use the Translate Toolkit. It helps me a lot in creating better
translations for OpenOffice.org, as I am sure it helps many others.

What I see as missing from pomigrate2 is that the PO catalogs should
actually be read during migration; relying on filenames alone is not
reliable enough.

I think of OpenOffice.org as a single gettext domain. The split into
many PO files helps us manage the translations, but these files are not
different domains. In KDE or GNOME a string does not move from the
Minesweeper game to the email client between releases, while in
OpenOffice.org strings easily move between modules.

Using pomigrate2 with the compendium option enabled is close to what I
would like, and I usually get better results with it. But applying a
compendium to the current POT files generated by oo2po is not the same
as moving messages, because messages are merged together. I would like
existing translations to be simply moved between files first, and the
compendium applied only after that, to get fuzzy matches for the new
strings.

Because pomigrate2 relies on filenames, it can also produce a lot of
bloat. When testautomation was included in the m83 SDF by mistake, we
got a new UI.po file in the testautomation directory with 3 new
messages. pomigrate2 found a large UI.po file in the source tree and
copied over hundreds of messages, kept as deprecated translation memory.
The three new messages were actually left untranslated. :)

= Contexts everywhere =

First, I would like oo2po to provide a msgctxt field for all messages,
not just for duplicates inside a single PO file. The resource IDs found
in the SDF file are exactly what the gettext context field is designed
for.

I know that this will create a lot of repetitive messages, but we all
learned long ago that every string should have its context. Different
languages distinguish between nouns and verbs, or have rich morphology
and require different word forms for different adjective cases, etc.

Translators can always auto-apply translation memory to quickly
translate identical messages with different contexts, and poconflicts
can be used to check whether a message is translated differently where
it should not be. I believe there is no drawback to using contexts with
all messages.
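
To illustrate what "contexts everywhere" would mean in the PO output,
here is a minimal Python sketch that writes one entry per resource ID,
each with its own msgctxt. It is not oo2po code; the function names and
the resource ID strings are made up for the example, and real SDF
resource IDs will look different.

```python
def po_escape(text):
    """Escape a string for use inside a double-quoted PO field."""
    return (text.replace("\\", "\\\\")
                .replace('"', '\\"')
                .replace("\n", "\\n"))


def po_entry(resource_id, msgid, msgstr=""):
    """Format a single PO entry that always carries a msgctxt.

    resource_id is a hypothetical stand-in for whatever identifier
    the SDF file provides for the string.
    """
    return (
        'msgctxt "%s"\n' % po_escape(resource_id)
        + 'msgid "%s"\n' % po_escape(msgid)
        + 'msgstr "%s"\n' % po_escape(msgstr)
    )


# The same English text with two (hypothetical) resource IDs becomes
# two separate entries, so each can be translated on its own.
entries = [
    po_entry("menu.file.open", "Open"),
    po_entry("dialog.status.open", "Open"),
]
catalog_text = "\n".join(entries)
```

With entries like these, a translator can give the verb and the
adjective reading of "Open" different translations, which is not
possible when the two messages are collapsed into one.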

= Merge reading all catalogs =

With msgctxt/msgid pairs, a smarter pomigrate2 could simply look for an
exact match when merging translations with the new POT files. This could
be implemented either by building a compendium and asking msgmerge to do
it, or by dedicated code that traverses every message in the new
templates.

Removed messages could be kept as deprecated translation memory in the
most probable file (the same file if it still exists, otherwise the file
to which most of the messages were moved or, finally, the file with the
most similar filename).

Now if a resource ID changes for any reason, we will still have good
fuzzy matching for that string.

I have not run any tests, but I strongly believe this will keep all
translations when messages are moved and the resource IDs stay the same,
and still provide a good fuzzy match if a resource ID changes.
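
The idea in this section can be sketched in a few lines of Python. This
is only an illustration under simplifying assumptions, not Translate
Toolkit or pomigrate2 code: catalogs are modelled as plain dictionaries
keyed by (msgctxt, msgid), the function name migrate and the cutoff
value are invented for the example, and fuzzy matching uses difflib
from the standard library instead of msgmerge.

```python
import difflib


def migrate(old_catalogs, new_templates, cutoff=0.7):
    """Move translations by (msgctxt, msgid), ignoring filenames.

    old_catalogs:  {filename: {(msgctxt, msgid): msgstr}}
    new_templates: {filename: [(msgctxt, msgid), ...]}
    Returns {filename: {(msgctxt, msgid): (msgstr, is_fuzzy)}}.
    """
    # Pool every existing translation from every old file.
    pool = {}
    for messages in old_catalogs.values():
        for key, msgstr in messages.items():
            if msgstr:
                pool[key] = msgstr

    # Index by msgid alone, used only for the fuzzy fallback.
    by_msgid = {}
    for (_ctxt, msgid), msgstr in pool.items():
        by_msgid.setdefault(msgid, msgstr)
    known_msgids = list(by_msgid)

    result = {}
    for filename, keys in new_templates.items():
        out = {}
        for key in keys:
            if key in pool:
                # Same resource ID and text: plain move, not fuzzy.
                out[key] = (pool[key], False)
                continue
            close = difflib.get_close_matches(
                key[1], known_msgids, n=1, cutoff=cutoff)
            if close:
                # Resource ID or text changed: suggest as fuzzy.
                out[key] = (by_msgid[close[0]], True)
            else:
                out[key] = ("", False)
        result[filename] = out
    return result


old = {"sw/source/ui.po": {("RID_OPEN", "Open"): "Otvori"}}
new = {"svx/source/ui.po": [("RID_OPEN", "Open"),
                            ("RID_OPEN_NEW", "Open"),
                            ("RID_OTHER", "Completely different")]}
migrated = migrate(old, new)
```

In the sample data the message with the unchanged resource ID is moved
to the new file as a normal translation, the one with a changed
resource ID is offered as a fuzzy match, and the unrelated message
stays untranslated. Keeping removed messages as deprecated translation
memory is left out of the sketch.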

= Some unrelated comments =

I wrote a wrapper around msgmerge that adds the --previous flag when it
is called from pomigrate2. That is a very useful msgmerge option when
reviewing fuzzy matches. I like the ediff output style provided by
Pology[1] even more (run posieve diff-previous lang.po after msgmerge
--previous), but that can be handled later by editors.
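
Such a wrapper can be very small. The Python sketch below shows one
possible shape and is not the actual script: the path in REAL_MSGMERGE
and the function names are assumptions, and it presumes the wrapper is
installed under the name msgmerge earlier on PATH so that pomigrate2
picks it up instead of the real binary.

```python
import subprocess

# Assumed location of the real msgmerge binary; adjust as needed.
REAL_MSGMERGE = "/usr/bin/msgmerge"


def with_previous(args):
    """Return the argument list with --previous added once."""
    args = list(args)
    if "--previous" in args:
        return args
    return ["--previous"] + args


def run_msgmerge(args):
    """Call the real msgmerge with --previous and return its exit code."""
    return subprocess.call([REAL_MSGMERGE] + with_previous(args))
```

A script installed as the wrapper would end with something like
sys.exit(run_msgmerge(sys.argv[1:])). All other arguments are passed
through unchanged, so the caller does not need to know the wrapper is
there.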

When using version control (we use a Subversion repository for the
Serbian OpenOffice.org translations), pomerge from the Translate Toolkit
makes a mess. It outputs all messages, not just the modified ones,
making version control diffs hard to comprehend.

I like the speed of pogrep and the power of pofilter, but for merging
corrections back I use Pology and posieve with the merge_corr_tree sieve
(with the pogrep output in ui-check, run posieve merge_corr_tree ui
-spathdelta:ui:ui-check to merge the fixes).

It goes over all catalogs in the ui tree, looks for the same catalogs
in the ui-check tree and updates only the modified messages, keeping the
diffs very clean. It would be great to have pomerge do the same.
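
The behaviour I would like from pomerge can be sketched as follows.
This is a simplified Python model, not Pology or Translate Toolkit
code: each tree is a dictionary mapping a relative path to a catalog,
each catalog maps (msgctxt, msgid) to msgstr, and the function names
are invented for the example.

```python
def merge_corrections(catalog, corrections):
    """Apply corrected translations to one catalog.

    Only messages that already exist in the catalog and whose
    translation actually differs are touched.
    Returns (merged_catalog, list_of_changed_keys).
    """
    merged = dict(catalog)
    changed = []
    for key, msgstr in corrections.items():
        if key in merged and merged[key] != msgstr:
            merged[key] = msgstr
            changed.append(key)
    return merged, changed


def merge_tree(tree, check_tree):
    """Return only the catalogs that need to be rewritten."""
    updated = {}
    for path, corrections in check_tree.items():
        if path not in tree:
            continue
        merged, changed = merge_corrections(tree[path], corrections)
        if changed:
            # Files without real changes are never rewritten.
            updated[path] = merged
    return updated


ui = {
    "sw/ui.po": {("ID1", "Open"): "Otvori", ("ID2", "Close"): "Zatvory"},
    "sc/ui.po": {("ID3", "Save"): "Sacuvaj"},
}
ui_check = {
    "sw/ui.po": {("ID2", "Close"): "Zatvori"},
    "sc/ui.po": {("ID3", "Save"): "Sacuvaj"},
}
to_write = merge_tree(ui, ui_check)
```

Here only sw/ui.po would be written back, with one message changed, so
the version control diff shows just the corrected translation.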

Best regards,
Goran Rakic

[1]
http://techbase.kde.org/Localization/Tools/Pology/PO_Embedded_Diffing#Lightweight_Diffing_when_Updating_Translation
