[EMAIL PROTECTED] wrote:
Thank you to Andre, Jonathan, Alex, and Jim!!!
Your great suggestions and word lists have provided me the necessary ingredients to achieve my goal. I will probably post further questions for optimizing the speed of word comparisons. If you have ideas or a script that works well, please post it if you don't mind. I appreciate your help!
Roger, this was a very useful trigger for me. I've had a half-completed project to write some word-game programs that got left behind a couple of months ago.
One of the things I needed then was a spellchecker; I looked briefly at the Mozilla spellchecker (but didn't like their dictionary - it seemed to have a lot of junk in it for my purposes). That was when I found the OpenOffice dictionaries, and looked at them enough to figure it would take some work (in particular, expanding their downloadable dictionary to a simple word list would take a C compiler - which set off my allergy to using C :-)
I looked at it again and decided I could dirty my hands for 5 minutes, so I downloaded the MySpell package and compiled the unmunch program, which converts from dict+affix to a simple word list.
Given that word list (162K words, 1.75Mbytes), I tried the simple brute force method, namely
  put the millisecs into tStart
  put tWords into field "inField"
  put 0 into t
  repeat for each word w in tWords
    add 1 to t
    replace "." with empty in w -- probably more of these should be done
    replace "," with empty in w
    replace "!" with empty in w
    if w is not among the words of gWords then
      set the textstyle of word t of field "inField" to "bold"
    end if
  end repeat
This took on average 8 millisecs per word in tWords. Perfectly adequate for small input "documents".
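As the comment in the loop says, more punctuation should probably be stripped. One way that might look is a single replaceText call with a character class, rather than one replace per character. This is only a sketch: the function name stripPunctuation and the particular set of characters are my assumptions, not something from the sample stack. Apostrophes and hyphens are deliberately left alone so that words like "don't" still match the dictionary.

```livecode
-- Hypothetical helper: the name and the character set are illustrative only.
function stripPunctuation pWord
  -- remove common sentence punctuation in one pass;
  -- apostrophes and hyphens are kept on purpose
  return replaceText(pWord, "[.,;:!?()]", empty)
end stripPunctuation
```

Inside the loop, the three replace lines would then become `put stripPunctuation(w) into w`.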
Then I tried a slightly more complex way. Setup:
  put url ("file:" & tFile) into gWords
  repeat for each word w in gWords
    put 1 into gArray[w]
  end repeat
and then
  put tWords into field "inField"
  put 0 into t
  repeat for each word w in tWords
    add 1 to t
    replace "." with empty in w
    replace "," with empty in w
    replace "!" with empty in w
    if gArray[w] <> 1 then
      set the textstyle of word t of field "inField" to "bold"
    end if
  end repeat
This took 2 millisecs for 50 words, so would be reasonable for even large-ish documents.
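For anyone who wants to reproduce the timing, here is roughly how the array version could be wrapped in a handler that reports the elapsed time. Treat it as a sketch under assumptions: the handler name checkWords is made up for illustration, it expects gArray to have been filled by the setup loop above, and the lock screen / unlock screen pair is an extra that may or may not help on a given machine (I have not measured it).

```livecode
-- Hypothetical handler: checkWords is an illustrative name, not part of the stack.
on checkWords tWords
  global gArray -- filled by the setup loop: gArray[w] is 1 for each known word
  put the milliseconds into tStart
  lock screen -- assumption: deferring redraws may speed up the styling
  put tWords into field "inField"
  put 0 into t
  repeat for each word w in tWords
    add 1 to t
    -- strip common punctuation before the lookup
    put replaceText(w, "[.,;:!?()]", empty) into w
    if gArray[w] <> 1 then
      set the textStyle of word t of field "inField" to "bold"
    end if
  end repeat
  unlock screen
  -- report the total time in the message box
  put (the milliseconds - tStart) && "ms for" && t && "words"
end checkWords
```

The point of the array is that each lookup is a single keyed access, instead of a scan through all 162K words of gWords for every input word.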
I tried to put this sample stack onto RevOnline - but I'm having some problem connecting to the server, so you can find it instead at
www.tweedly.net/RunRev/SpellCheck.rev
www.tweedly.net/RunRev/allwords.dic
(remember the dic is 1.75M - don't download it unless you really want it!)
-- Alex.
_______________________________________________ use-revolution mailing list [email protected] http://lists.runrev.com/mailman/listinfo/use-revolution
