On 06/24/2012 05:01 PM, webmaster-Kracked_P_P wrote:

> BUT, I would like to make sure all of my .oxt dictionaries have the 
> words/terms we use every day in articles and email support for LibreOffice 
> and other open source related "items".

If you have the disk capacity, then:
* Download the Wikipedia article database;
* Run a script that writes each word it finds into a file;
* Manually go through the list, to pick out the misspellings;
* Merge the "correct words" list into your existing wordlist;
* Merge the "known misspelling" list into the autocorrect list.
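
A rough sketch of the script and merge steps above, in Python. It assumes
the Wikipedia dump has already been converted to plain text, one chunk per
line; the function names and the output layout are made up for illustration
and are not taken from any existing tool.

```python
import re
from collections import Counter

# Runs of Unicode letters, allowing internal apostrophes or hyphens
# (so "LibreOffice's" and "open-source" each count as one word).
WORD_RE = re.compile(r"[^\W\d_]+(?:['-][^\W\d_]+)*")


def count_words(lines):
    """Count every word found in an iterable of text lines."""
    counts = Counter()
    for line in lines:
        counts.update(WORD_RE.findall(line))
    return counts


def write_candidates(counts, out):
    """Write 'count<TAB>word' lines to an open text file, most frequent first.

    The layout is an assumption; it just makes manual review easier,
    since one-off misspellings sink to the bottom of the file.
    """
    for word, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        out.write(f"{n}\t{word}\n")


def merge_wordlists(existing, reviewed):
    """Return the sorted union of two word collections, without duplicates."""
    return sorted(set(existing) | set(reviewed))
```

Sorting by frequency is only a convenience for the manual pass; the review
itself still has to be done by a person.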

Two potential issues with this approach:
* Names of individuals, organizations, and things are included;
* Foreign words are included.

Whilst there are ways to eliminate both of those problems, the usual
result, when doing so with scripts, is that legitimate words in the
target language are removed along with the foreign words and proper
nouns.  As one example, the Afrikaans dictionary omitted the word "die"
for several years, because the script that was used to eliminate
non-Afrikaans words treated that word as the English "die".
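
To illustrate how that kind of loss can happen, here is a small Python
sketch of a naive filter.  I do not know how the actual Afrikaans script
worked; the function and the tiny word lists below are invented purely for
illustration.

```python
def naive_foreign_filter(candidates, foreign_words):
    """Drop every candidate that also appears in a foreign-language list.

    This is the simplistic approach: it cannot tell a borrowed word
    from a native word that merely shares its spelling.
    """
    foreign = {w.lower() for w in foreign_words}
    return [w for w in candidates if w.lower() not in foreign]


# Hypothetical sample data, not real dictionary contents.
afrikaans_candidates = ["die", "huis", "water", "hand"]
english_words = ["die", "house", "water", "hand"]

kept = naive_foreign_filter(afrikaans_candidates, english_words)
# "die" (Afrikaans for "the") is removed along with "water" and "hand",
# because all three are spelled the same as English words.
```

Any automated filter of this sort needs a manual check of what it removed,
not only of what it kept.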

jonathon
