We have been generating spelling dictionaries using all the "words" in our data; this enables us to capture names, for example. But when the dictionary gets large, spell:suggest seems to slow down a lot. We've noticed this with a dictionary containing roughly 2M+ words (yes, there is a lot of junk in there). My questions:
1) Is this expected?

2) I've considered limiting the dictionary size by using only words that occur more than N times. But the way we have been building our dictionary is:

    xdmp:document-insert(
      "/spelling/spelling-dictionary.xml",
      spell:make-dictionary(cts:field-words("body")),
      xdmp:default-permissions(),
      ("http://marklogic.com/xdmp/documents", "http://marklogic.com/xdmp/spell"))

and word lexicons don't include any frequency information. There must be frequency info stored somewhere in ML in order for it to be able to make its relevance calculations. Is that exposed anywhere in the API? Is there some other approach that would work here? A ready-made dictionary-crafting module, perhaps?

-- 
Michael Sokolov
Engineering Director
www.ifactory.com