Hi Lars,
Here is another way:
find OOE680_m6 -name '*.sdf' -exec cat {} + |
awk -F'\t' '$10=="sv"{print $11}' | sed 's/~//g;s/\\[nt]/ /g'
Apparently, a tilde precedes the underlined shortcut letter in menus
(E~xport), and the texts contain a literal \n for newlines.
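
To make the effect of the awk and sed steps concrete, here is a small sketch run on a made-up record. The sample line and the function names are hypothetical; the only things taken from the command above are that field 10 holds the language code and field 11 holds the text.

```shell
# Hypothetical sample: one tab-separated record whose 10th field is the
# language code and whose 11th field is the UI string.
sample_line() {
    printf 'f1\tf2\tf3\tf4\tf5\tf6\tf7\tf8\tf9\tsv\tE~xportera\\nsom PDF\n'
}

# Keep only records of the requested language ($1), print the text field,
# then drop tildes and turn literal \n and \t sequences into spaces.
clean_strings() {
    awk -F'\t' -v lang="$1" '$10 == lang { print $11 }' |
        sed 's/~//g;s/\\[nt]/ /g'
}

sample_line | clean_strings sv
```

The tilde and the two-character \n sequence are gone from the output, so a spell checker sees ordinary words.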
I removed tildes etc. from the result file, but I kept them in the
source txt file, because you need to find the faulty source segment,
and without knowing about the tilde and the rest of the "formatting
garbage" you cannot pin down the right sdf. Of course, we could include
the ID and set its style to "no language" in ODF (converting to ODF is
easy in this case, and could be done with awk and zip), but I started
with something much simpler.
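
One cheap way to pin down the right sdf without going as far as ODF is to print the file name and line number next to each string. This is only a sketch; the function name locate_strings is made up, and it leaves the tildes and \n sequences untouched so the output can be matched against the source.

```shell
# Print "file:line<TAB>text" for every record of the given language.
# $1 = directory to search, $2 = language code (both supplied by the caller).
locate_strings() {
    find "$1" -name '*.sdf' -exec awk -F'\t' -v lang="$2" '
        # FNR restarts at 1 for each file, so it is the line number in that file
        $10 == lang { print FILENAME ":" FNR "\t" $11 }
    ' {} +
}
```

For example, locate_strings OOE680_m6 sv > sv_source.txt would give a source file in which every string carries its origin.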
In the future, I think a simple style tagger should be used. Let me
explain: there should be a special "no language" style for help tags
etc., so that they are not checked. Most translation tools support such
things, for example the free TortoiseTagger for Word, OmegaT, MemoQ or
Across (all free and/or open source), not to mention enlasotools (a
dedicated filter set), but awk would probably be enough even for XML
tagging in the help file. So two files would be needed: a complete text
file (probably with some additional info like IDs), and a tagged ODF
file for spell and grammar checking.
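
As a much cruder stand-in for a real tagger, the tags could simply be removed from the copy that goes to the checker, while the complete file stays as it is. The sketch below assumes that tags appear as plain <...> sequences, which may not match how the help strings are actually escaped in the sdf files; strip_tags is a made-up name.

```shell
# Replace anything that looks like <tag ...> with a space, then collapse
# runs of spaces, so only the translatable words reach the checker.
strip_tags() {
    sed 's/<[^>]*>/ /g;s/  */ /g'
}

printf 'Click <emph>OK</emph> now\n' | strip_tags
```

Note that this deletes the tags instead of marking them as "no language", so it only serves the checking copy, never the file that goes back into the build.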
I haven't started working on that yet, as the schedule is
unrealistically tight for additional translation QA _before_ release and
_after_ integrating the translation. My idea was born out of the fact
that Polish translations had broken characters in the latest builds just
because of a faulty conversion to UTF-8, and that would have been
detected automatically by a spell check. So this should be a testing
step before the release and after integrating the localized strings.
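
A spell check is not the only possible detector here. As a complementary and narrower test, iconv can report files that are not valid UTF-8 at all. This is a sketch with a made-up function name, and it will not catch text that was converted twice and is still valid UTF-8; that kind of damage is where the spell check is needed.

```shell
# Print the name of every file given as argument that is not valid UTF-8.
check_utf8() {
    for f in "$@"; do
        # iconv exits non-zero when it meets a byte sequence that is not
        # valid in the declared input encoding
        iconv -f UTF-8 -t UTF-8 "$f" >/dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

It could be run over the extracted strings or directly over the sdf files, for example check_utf8 OOE680_m6/*.sdf (the path layout is an assumption).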
See my proposal:
http://wiki.services.openoffice.org/wiki/Automating_Translation_QA
> I have no experience from the tools used in translation. Is
> anything like Alchemy Catalyst available as free software? Could
> such functionality be built into future releases of OpenOffice?
> I would think that OpenOffice has many users who are translators,
> especially since the software is adopted in poorer countries where
> all kinds of languages are spoken.
Catalyst is free, but only in a very restricted version (no way to
create new projects). But it is only one of the tools available, as I
mentioned above. Anyway, these tests are quite trivial to implement
using sed, awk and other standard Unix tools, which run happily on
Win32 under Cygwin.
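
To illustrate how little is needed, here is a sketch of a very crude word check built only from tr, sort, sed and comm. It is not a replacement for a real spell checker: it is case-sensitive, knows nothing about inflection, and the word list file is something you would have to supply yourself. The name unknown_words is made up.

```shell
# Read text on stdin and print every word that is not in the word list.
# $1 = word list, one known word per line (hypothetical file).
unknown_words() {
    dict_sorted=$(mktemp)
    # comm needs both inputs sorted the same way, so sort in the C locale
    LC_ALL=C sort -u "$1" > "$dict_sorted"
    # Split on whitespace, ASCII punctuation and digits; bytes of non-ASCII
    # letters are left alone, so UTF-8 words stay in one piece.
    LC_ALL=C tr -s '[:space:][:punct:][:digit:]' '\n' |
        sed '/^$/d' |
        LC_ALL=C sort -u |
        LC_ALL=C comm -23 - "$dict_sorted"
    rm -f "$dict_sorted"
}
```

Feeding it the cleaned strings from the command at the top of this mail would give a list of candidate typos to look at by hand.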
Regards,
Marcin