[users] Re: Format of .dic Files

Russell Butler Thu, 05 Jul 2007 16:55:54 -0700

das wrote:
> On Thu, 2007-07-05 at 11:03 +0200, Thomas Lange - Sun Germany - ham02 -
> Hamburg wrote:
>> Hi again,
>>
>> I forgot to mention that the tagged file format also uses UTF-8
>> encoding. Thus you'll need a UTF-8 capable text editor to properly
>> view and edit those files.
>>
>> Also just in case:
>> The string following the language tag refers to the ISO locale of the
>> language the dictionary is to be used with.
>> E.g.  en-US would be English (USA) and de-CH would be German
>> (Swiss)...
>> And the line
>> lang: <none>
>> will be used for dictionaries that are to be used for all languages.
>>
>> Please be aware that in this file format spaces do matter!
>> Have the wrong number of spaces, use tabs, or add additional spaces at
>> the end and it may not work.
>>
>>
>> Thomas
>>


Hi all

Just for completeness in this thread, and in case someone sees it in the
archives, here is the gist of a reply I sent to Das privately.

 For some reason it works when you open the personal.dic file with UTF-8
encoding. This was just opening the file directly, not preprocessing
with Kelvin's macro.

OK - trying to be explicit:

1 In OpenOffice.org: File > Open personal.dic (it may be in
~/ooo-version/user/wordbook/personal.dic)

2 ASCII Filter Options:
 Character set: UTF-8
 Font: whatever you wish
 Language: None
 Paragraph break: LF

3 This will give you the list with words separated by # or ##. You will
find a ##WBSWG6�## or similar at the beginning of the file; this can be
deleted.

4 Use search and replace with regular expressions to separate words into
paragraphs

5 Ctrl-A then Sort Ascending

6 Clean up as desired.
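
For anyone who would rather script steps 3 to 6 than do them by hand, here
is a rough Python sketch. It only assumes what is described above: words
separated by one or more # characters, and a header token starting with
WBSWG at the front. The function name extract_words and the duplicate
removal are my own additions for illustration, not part of any
OpenOffice.org tool, and the real file layout may differ between versions.

```python
import re
import sys


def extract_words(raw: bytes) -> list[str]:
    """Return the sorted word list from the raw bytes of a personal.dic file.

    Assumptions (taken from the steps above, not from a format spec):
    words are separated by runs of '#', and the header token starts
    with "WBSWG".
    """
    # Decode as UTF-8; undecodable header bytes become U+FFFD instead of raising.
    text = raw.decode("utf-8", errors="replace")
    # Step 4: split on one or more '#' so each word stands alone.
    tokens = [t.strip() for t in re.split(r"#+", text)]
    # Step 3: drop empty tokens and the header token.
    words = {t for t in tokens if t and not t.startswith("WBSWG")}
    # Steps 5 and 6: sort ascending, ignoring case; the set above removed duplicates.
    return sorted(words, key=lambda w: (w.casefold(), w))


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python script.py /path/to/personal.dic
    with open(sys.argv[1], "rb") as handle:
        for word in extract_words(handle.read()):
            print(word)
```

Work on a copy of personal.dic rather than the original; the sketch only
reads the file and prints one word per line.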


I hope someone may find this helpful.

Russell
