Re: [users] Re: Format of .dic Files

Thomas Lange - Sun Germany - ham02 - Hamburg Mon, 09 Jul 2007 00:46:26 -0700

Hello Russell Butler,

> Just for completeness in this thread, and in case someone sees it in the
> archives, here is the gist of a reply I sent to Das privately.
> 
>  For some reason it works when you open the personal.dic file with UTF-8
> encoding. This was just opening the file directly, not preprocessing
> with Kelvin's macro.
> 
> OK - trying to be explicit:
> 
> 1 In openoffice.org: File-open personal.dic (may be in
> ~/ooo-version/user/wordbook/personal.dic )
> 
> 2 Ascii Filter options:
>  Character set UTF-8
>  font whatever you wish
>  Language None
>  Paragraph  break LF
> 
> 3 This will give you the list with words separated by # or ##. You will
> find a ##WBSWG6## or similar at the beginning of the file; this can be
> deleted.
> 
> 4 Use search and replace with regular expressions to separate words into
> paragraphs
> 
> 5 Ctrl-A then Sort Ascending
> 
> 6 Clean up as desired.


Yes, using UTF-8 as the charset encoding will show the strings properly,
since the WBSWG6 binary file format uses that encoding for the strings.

But even though the strings are UTF-8 encoded, the dictionary file itself
is not! It is a binary format. That is, you cannot expect all characters
(here, the non-string parts of the file) to be properly displayed or
even read!
So saving such a binary file from within a text editor after modifying
it is very unlikely to be a good idea.

To sum it up: as long as you do not save the edited file as your personal
dictionary again, I see no problem with your approach.
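
For anyone who would rather script the extraction than do it in Writer,
here is a minimal Python sketch of the steps quoted above. It assumes only
what this thread reports: that the words appear as UTF-8 strings separated
by # or ##, and that the file begins with a WBSWG marker. The function
names and the output file are made up for illustration. This is not a
parser for the actual binary format; it only reads the dictionary and
writes a separate plain-text list.

```python
from pathlib import Path


def extract_words(raw: bytes) -> list[str]:
    """Pull the readable words out of the raw bytes of a .dic file."""
    # Decode as UTF-8; bytes from the non-string parts of the binary
    # file that are not valid UTF-8 become U+FFFD instead of raising.
    text = raw.decode("utf-8", errors="replace")
    words = set()
    for chunk in text.split("#"):
        chunk = chunk.strip()
        # Skip empty pieces and the leading format marker
        # (assumption: it starts with "WBSWG", as seen in this thread).
        if not chunk or chunk.startswith("WBSWG"):
            continue
        # Skip pieces that contain undecodable or control characters.
        if "\ufffd" in chunk or not chunk.isprintable():
            continue
        words.add(chunk)
    # Case-insensitive ascending sort, with the word itself as tie-breaker.
    return sorted(words, key=lambda w: (w.casefold(), w))


def dump_wordlist(src: str, dst: str) -> int:
    """Write the words found in src to dst, one per line; return the count."""
    src_path, dst_path = Path(src), Path(dst)
    # Never write the text back over the binary dictionary itself.
    if src_path.resolve() == dst_path.resolve():
        raise ValueError("refusing to overwrite the dictionary file")
    words = extract_words(src_path.read_bytes())
    dst_path.write_text("\n".join(words) + "\n", encoding="utf-8")
    return len(words)
```

Calling dump_wordlist("personal.dic", "wordlist.txt") (both paths are
placeholders) leaves personal.dic untouched, which is the point: the
result is a plain word list, not a dictionary file you can put back.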


Regards,
Thomas

