Hi Lars,

Here is another way:

  find OOE680_m6 -name '*.sdf' -exec cat {} + |
    awk -F'\t' '$10 == "sv" { print $11 }' | sed 's/~//g; s/\\[nt]/ /g'

Apparently, a tilde precedes the underlined shortcut letter in menus (E~xport), and the texts contain a literal \n for newlines.

I removed the tildes etc. from the result file but kept them in the source txt file: to fix a faulty segment you have to find it in the source, and without the tilde and the rest of the "formatting garbage" you cannot pin down the right sdf. Of course, we could include the ID and set its style to "no language" in ODF (converting to ODF is easy in this case and could be done with awk and zip), but I started with something much simpler.
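To make it easier to get from a flagged word back to the right sdf, the source listing could carry the file name and line number next to the raw text. This is only a sketch: list_segments is a made-up helper name, and the column numbers (10 for the language, 11 for the text) are simply taken from the one-liner above.

```shell
# Hypothetical helper: print "file:line<TAB>raw text" for one language.
# Columns 10 (language) and 11 (text) are assumed, as in the one-liner above.
# The text is printed untouched, so tildes and \n sequences stay searchable.
list_segments() {
  lang=$1
  shift
  awk -F'\t' -v lang="$lang" \
    '$10 == lang { print FILENAME ":" FNR "\t" $11 }' "$@"
}

# Possible usage (not run here):
#   find OOE680_m6 -name '*.sdf' -exec cat {} + > /dev/null   # sanity check
#   list_segments sv path/to/localize.sdf > source-sv.txt
```

The cleaned result file can then be produced from the second column with the same sed call as before, while the first column tells you where to fix the segment.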

In the future, I think a simple style tagger should be used. Let me explain: there should be a special "no language" style for help tags etc., so that they are not checked. Most translation tools support this: the free TortoiseTagger for Word, for example, as well as OmegaT, MemoQ and Across (all free and/or open source), not to mention enlasotools (a dedicated filter set). But awk would probably be enough, even for XML tagging in the help files. So two files would be needed: a complete text file (probably with some additional info such as IDs) and a tagged ODF file for spell and grammar checking.
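A rough sketch of what such a tagger could look like with plain sed. The function name tag_markup and the style name "NoLanguage" are my own placeholders; the style itself would still have to be defined in the ODF file with its language set to none, and the rest of the ODF packaging is left out here.

```shell
# Hypothetical tagger: wrap every <...> tag found in the text in a span
# that uses an assumed "NoLanguage" character style, so that a spell
# checker would skip it. Reads stdin, writes stdout.
tag_markup() {
  # 1) escape bare ampersands, 2) escape each tag and wrap it in a span
  sed -e 's/&/\&amp;/g' \
      -e 's/<\([^>]*\)>/<text:span text:style-name="NoLanguage">\&lt;\1\&gt;<\/text:span>/g'
}
```

Only the text between the tags is left in the default language, so only that part would be spell-checked.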

I haven't started working on that yet, though, as the schedule is unrealistically tight for additional translation QA _before_ release and _after_ integrating the translation. My idea was born out of the fact that the Polish translations had broken characters in the latest builds just because of a faulty conversion to UTF-8, and a spell-check would have detected that automatically. So this should be a testing step after integrating the localized strings and before the release.
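Even before any spell-checking, a cheap first test could be to verify that the files are valid UTF-8 at all. The sketch below assumes that the damage shows up as invalid byte sequences; characters that were converted wrongly but are still valid UTF-8 would pass it and would only be caught by the spell-check. The name check_utf8 is made up.

```shell
# Hypothetical check: report every file that is not valid UTF-8.
# iconv exits with a non-zero status when it cannot decode its input.
check_utf8() {
  for f in "$@"; do
    iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1 ||
      echo "invalid UTF-8: $f"
  done
}
```

It prints nothing when all files are clean, so it is easy to hook into a build script.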

See my proposal:
http://wiki.services.openoffice.org/wiki/Automating_Translation_QA

I have no experience with the tools used in translation. Is anything like Alchemy Catalyst available as free software? Could such functionality be built into future releases of OpenOffice? I would think that OpenOffice has many users who are translators, especially since the software is adopted in poorer countries where all kinds of languages are spoken.

Catalyst is free, but only in a very restricted version (no way to create new projects). But it is only one of the tools available, as I mentioned above. Anyway, these tests are quite trivial to implement using sed, awk and other standard Unix tools, which run happily on Win32 under Cygwin.

Regards,
Marcin
