Hi everyone,

I am new here and I hope someone can help me.

I am a translator. My colleagues and I use CAT software with TMX files and
glossaries stored as plain text files (UTF-8 encoded, TAB-separated), as well
as other tools of course.

Over the years, some of us have collected a huge amount of terminology (with
its definitions) from each other and from various other sources.
Unfortunately, this terminology is in plain text files encoded in UTF-8, but
with no TABs or similar field separators.

In order to be able to use this Terminology as a glossary or to convert it
into a TMX and then use it efficiently with our CAT-Software (OmegaT
mostly), it has to be in the following format:

Language-A TAB Language-B
or 
Language-A ; Language-B

I'll show you an example of our files so you can better understand my problem.
I will use a German-Greek terminology example, because I believe the
different alphabets of the two languages will make it easier to achieve our
goal:

Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten (πληθ.) δαπάνες κατεδάφισης
abbruchreif ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen (ουδ.) εταιρεία κατεδάφισης
abbuchen χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

As a minimum requirement for our goal, it should look like this:

Abbrucharbeiten;(πληθ.) εργασίες κατεδάφισης
Abbruchkosten;(πληθ.) δαπάνες κατεδάφισης
abbruchreif;ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung;(θηλ.) άδεια κατεδάφισης
Abbruchunternehmen;(ουδ.) εταιρεία κατεδάφισης
abbuchen;χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

or

Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten   (πληθ.) δαπάνες κατεδάφισης
abbruchreif     ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung     (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen      (ουδ.) εταιρεία κατεδάφισης
abbuchen        χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

(where the whitespace after the German word is a TAB)

Does anyone know a way to achieve this in LibreOffice, or in any other
editor?

I was thinking of something like:
    Search for German-language characters + space at the beginning of the
    line, and
    replace them with the same word + ";" or TAB
My colleagues and I would be most grateful if any of you could provide a
simple solution or suggestion.

I thank you all in advance for your help and interest.

Constantine




