At 13:45 24/12/2014 -0700, Constantine Marberg wrote:
This Terminology is unfortunately in simple text-files encoded with UTF8 but with no TABs or similar as field-separators. In order to be able to use this Terminology as a glossary or to convert it into a TMX and then use it efficiently with our CAT-Software (OmegaT mostly), it has to be in the following format:

Language-A TAB Language-B
or
Language-A ; Language-B

Does any of you know a way to achieve this in LibreOffice or in any other editor? I was thinking of something like: search for German-language characters + space at the beginning of the sentence/line and replace it with the same word + ";" or TAB

It won't be easy to search for characters in a particular language. But in all your examples the part you want separated from the remainder of the line is a single word, so you need only find the first space on each line and replace it with a semicolon or a tab character.

o Paste your material into a LibreOffice text (Writer) document.
o Put the cursor at the beginning of the text.
o Go to Edit | Find & Replace... (or press Ctrl+H).
o In the "Search for" box, enter:
([^ ]*) (.*)
- that's leftparenthesis-leftbracket-circumflex-space-rightbracket-asterisk-rightparenthesis-space-leftparenthesis-dot-asterisk-rightparenthesis.
o In the "Replace with" box, enter either:
$1;$2 or $1\t$2
- that's dollar-one-semicolon-dollar-two or dollar-one-backslash-lowercasetee-dollar-two (as preferred).
o Click More Options.
o Ensure "Regular expressions" is ticked.
o Click Replace All.

Do this only once: a second pass would replace the next remaining space on each line as well.
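
If it helps to see what the pattern captures, here is a minimal sketch in Python using the same expression. The sample line is made up for illustration, and split_first_space is only a hypothetical helper name, not anything from LibreOffice. It also shows what a second pass does to a line that has already been processed.

```python
import re

# Same pattern as in the "Search for" box:
# group 1 = everything up to the first space, group 2 = the rest of the line.
PATTERN = re.compile(r"^([^ ]*) (.*)$")

def split_first_space(line, sep="\t"):
    """Replace only the first space in `line` with `sep` (hypothetical helper)."""
    return PATTERN.sub(lambda m: m.group(1) + sep + m.group(2), line, count=1)

# Made-up sample line, not taken from the actual terminology files.
once = split_first_space("Haus house of cards", ";")
twice = split_first_space(once, ";")

print(once)   # Haus;house of cards
print(twice)  # Haus;house;of cards  (the second pass ate the next space)
```

A line with no space at all does not match the pattern and comes back unchanged.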

You can copy and paste the resulting material wherever you want it to go, or you can use File | Save As... and select a plain text format for "Save as type".
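
If you have many files and would rather not paste each one into Writer, the same first-space replacement could be scripted. The following is only a sketch under some assumptions: the files are UTF-8 as you describe, the first word of every line is the whole source term, and the function name convert_glossary and the file names in the usage line are hypothetical.

```python
from pathlib import Path

def convert_glossary(src, dst, sep="\t"):
    """Copy `src` to `dst`, replacing the first space of each line with `sep`.

    Returns the number of lines written.
    """
    out_lines = []
    for line in Path(src).read_text(encoding="utf-8").splitlines():
        term, space, rest = line.partition(" ")
        # Lines without any space (blank or single-word) pass through unchanged.
        out_lines.append(term + sep + rest if space else line)
    Path(dst).write_text("".join(l + "\n" for l in out_lines), encoding="utf-8")
    return len(out_lines)
```

Usage would be something like convert_glossary("terms.txt", "glossary.txt"). Because it writes to a separate output file, running it again on the original input is harmless; running it on its own output would have the same second-pass effect as repeating Replace All.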

I trust this helps.

Brian Barker

