On 24-12-2014 21:45, Constantine wrote:
Hi everyone,

I am new here and I hope someone can help me.

I am a translator and my colleagues and I use CAT-software with TMX-files,
glossaries in simple text-files which are in UTF8 encoding and TAB-separated
as well as other tools of course.

Over the years, some of us have collected a huge amount of terminology (with its
definitions) from each other and various other sources.
This Terminology is unfortunately in simple text-files encoded with UTF8 but
with no TABs or similar as field-separators.

In order to be able to use this Terminology as a glossary or to convert it
into a TMX and then use it efficiently with our CAT-Software (OmegaT
mostly), it has to be in the following format:

Language-A TAB Language-B
or
Language-A ; Language-B

I'll show you an example of our files so you can better understand my problem.
I will use a German-Greek terminology example, because I believe the
different character sets of the two languages will make it easier to achieve our goal:

Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten (πληθ.) δαπάνες κατεδάφισης
abbruchreif ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen (ουδ.) εταιρεία κατεδάφισης
abbuchen χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

As a minimum requirement for our goal, it should look like this:

Abbrucharbeiten;(πληθ.) εργασίες κατεδάφισης
Abbruchkosten;(πληθ.) δαπάνες κατεδάφισης
abbruchreif;ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung;(θηλ.) άδεια κατεδάφισης
Abbruchunternehmen;(ουδ.) εταιρεία κατεδάφισης
abbuchen;χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

or

Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης
Abbruchkosten   (πληθ.) δαπάνες κατεδάφισης
abbruchreif     ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι
ετοιμόρροπο
Abbruchsbewilligung     (θηλ.) άδεια κατεδάφισης
Abbruchunternehmen      (ουδ.) εταιρεία κατεδάφισης
abbuchen        χρεώνω, Gebühren vom Konto ~ τα τέλη χρεώνονται στον τραπεζικό
λογαριασμό

(where the space is a TAB)

Does any of you know a way to achieve this in LibreOffice or in any
other editor?

I was thinking of something like:
    Search for German-language characters + space at the beginning of the line
    and replace them with the same word + ";" or TAB

My colleagues and I would be most grateful if any of you could provide a
simple solution or suggestion.

I thank you all in advance for your help and interest.

Constantine



After a lot of responses on how to do this in Writer,
here is a short note on how to do it in Calc ;-)

Open the text file; when the 'Text import' wizard is shown, do:
1) Select character set 'Unicode (UTF-8)'
2) Separator options: 'separated by', check 'Tab' and 'Space'; other options should not be checked.
3) At 'Text delimiter', type a space
4) Click 'OK'

5) Insert a new column B, and fill it with a semicolon ';'

6) Click 'Save as', type a name, and check 'Edit filter settings'
7) The 'Export Text file' wizard should be shown.
8) Character set: 'Unicode (UTF-8)'
9) Field delimiter: space ' '
10) Text delimiter: <empty> ''
11) Checkboxes: leave only 'Save cell content as shown' checked.
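
For anyone who would rather script this than go through Calc, here is a
minimal Python sketch of the search-and-replace idea from the original
question. It is not what the Calc steps above do internally; it only
replaces the first space on each line with a separator. It assumes that
every German headword is a single word with no spaces (true for the sample
lines, not necessarily for the whole glossary) and that wrapped continuation
lines consist of a single word, so they are left untouched. The names
FIRST_WORD and add_separator are made up for this example.

```python
import re

# First run of non-space characters at the start of a line, followed by one
# space. Assumption: the German headword never contains a space itself.
FIRST_WORD = re.compile(r"^(\S+) ", flags=re.MULTILINE)


def add_separator(text: str, sep: str = ";") -> str:
    """Replace the first space on each line with `sep` (';' or a TAB)."""
    # A function is used as the replacement so that `sep` is inserted
    # literally, without regex escape processing.
    return FIRST_WORD.sub(lambda m: m.group(1) + sep, text)


if __name__ == "__main__":
    sample = (
        "Abbrucharbeiten (πληθ.) εργασίες κατεδάφισης\n"
        "abbruchreif ετοιμόρροπος, κατεδαφιστέος, das Haus ist ~ το σπίτι είναι\n"
        "ετοιμόρροπο\n"
    )
    print(add_separator(sample))        # semicolon-separated
    print(add_separator(sample, "\t"))  # TAB-separated
```

To run it on a real file, read the file as UTF-8, pass its contents to
add_separator, and write the result back as UTF-8. Entries whose headword
consists of several words would be split too early and need manual checking.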




