reminds me of "and the longest word in the English language is ..."
or is it supercalifragilisticexpialidocious ;-)

https://www.youtube.com/watch?v=tRFHXMQP-QU

From: Kracked_P_P---webmaster <webmas...@krackedpress.com>
Date: Thu, May 22, 2014 at 8:58 AM
Subject: Re: [libreoffice-users] Re: Spell Check Dictionary
To: users@global.libreoffice.org

There are 797,866 lines in the .dic file. The top line holds the number of words, and every line after it is one word. The .dic file treats each line, except the first, as an individual word, and each line is a correct spelling of a word. The first part of the list is the capitalized words and the rest are the lowercased ones.

"timed" and "timing" are two forms of a single root word and are not considered the same word as "time". If you create a word list of a document, covering all of the words used, then time, timed, and timing are three individually listed words. Just because they share the same root word does not mean they are the same word.

Also, for a spell checker, a word with the first letter uppercased and the same word with that letter lowercased are treated differently. When not the first word in a sentence, some words are allowed, or even required, to have the first letter uppercased, while others are misspelled if the first letter is uppercased. That is defined in the spell checking .dic file.

You can either take a word and list each version of it, or you can figure out all the control "options" that follow the word so that a single entry also defines all of its prefixed and suffixed versions. Since I do not know those control codes, I listed each form or version of the word out in the list, which also let me give a "good" word count.

So the count of 797,865 words in the .dic file is correct.

Would you like to deal with my unpublished 3,068,588 word .dic file that has even more versions and correct spellings of "en_US" words? It contains many, many suffix and prefix versions that are rarely seen but technically spelled correctly. I just created that version to see how massive it could go.
But, I will not publish it as a single dictionary. It would be divided up into "common" and "rare" files to be enabled/disabled as the user would choose. For now, the spell checking extension project is not going to be continued till a lot of other projects are finished - LO projects and many more non-LO projects.
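
For anyone who wants to check a word list against the layout described above (first line is the word count, every later line is one word), here is a minimal Python sketch. It is only an illustration, not how the dictionary in this thread was built: the function name parse_dic and the sample entries are made up, and the handling of "word/FLAGS" assumes the usual Hunspell convention where the control options follow a slash. Hunspell itself treats the first line as an approximate count and also accepts a capitalized form of a lowercase entry at the start of a sentence; this sketch does neither and ignores escaped slashes.

```python
from typing import Iterable, List, Tuple


def parse_dic(lines: Iterable[str]) -> Tuple[int, List[str]]:
    """Return (declared_count, words) from the lines of a .dic word list.

    Hypothetical helper: assumes the first line is the word count and
    every following non-empty line is one entry, optionally "word/FLAGS".
    """
    it = iter(lines)
    declared = int(next(it).strip())  # first line: number of words
    words: List[str] = []
    for line in it:
        entry = line.strip()
        if not entry:
            continue  # skip blank lines
        # Keep only the word part; anything after "/" is affix flags.
        words.append(entry.split("/", 1)[0])
    return declared, words


# Made-up sample: one capitalized word, then three forms of one root.
sample = ["4", "Monday", "time", "timed", "timing"]
declared, words = parse_dic(sample)
lookup = set(words)  # exact, case-sensitive lookup

print(declared == len(words))  # does the header match the entries?
print("timed" in lookup, "Monday" in lookup, "monday" in lookup)
```

With a real file you would pass an open file object instead of the sample list, for example parse_dic(open("en_US.dic", encoding="utf-8")), where the file name and encoding are assumptions about your own setup.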