Getting docs that match a search phrase is easy (using the case-, punctuation-, and whitespace-insensitive options), and finding docs that have the highest frequency of the words in the search phrase is easy (cts:word-query with a sequence of terms), but I want to find the docs that most closely match the search phrase.
For example, say I have a doc that contains this text: "Mary had a little lamb whose fleece was white as snow".

If I search using "mary had a little lamb whose fleece was white as SNOW!!!", a cts:word-query would match if I sent the entire phrase and used all the "insensitive" options.

If I search by tokenizing the phrase into ("mary", "had", "little", "lamb", "fleece", "white", "snow"), I will get the doc that has the highest frequency of those words (weighted according to doc size), which may or may not be my "Mary had a little lamb" doc.

And if I search for "Jane had a little lamb whose fleece was white as snow", the Mary doc won't match because the phrase doesn't match, and a tokenized word search probably won't return it first because some other doc with "Jane" and "snow" or whatever would rank higher. I can try a near query of all the words, except that "Jane" isn't in the doc, so there would be no match for my Mary doc.

What I want is the doc that has a phrase that most closely matches the search phrase, even if I drop, replace, or introduce an incorrect word. And I mean more than just a misspelling. You can see that "Jane had a little lamb whose fleece was white as snow" is really close to "Mary had a little lamb whose fleece was white as snow", but I can't quite figure out how to get MarkLogic to determine that quickly, since the phrase won't match and tokenized words won't necessarily give me the best relevance.

I can generate all the permutations of the phrase (every word with all the other words in all combinations) and OR them together, but search performance suffers after just a few permutations.

Anyone know how to do this?

thanks,
-Ryan
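To put rough numbers on the two points above, here is a minimal sketch in plain Python (not MarkLogic code, and not a proposed solution). It counts how many OR clauses the "every combination of words" approach would need for the Jane phrase, and it scores how close the Jane phrase is to the Mary text at the word level. The helper names `tokenize`, `word_combinations`, and `word_similarity` are made up for illustration, and the similarity measure (the `difflib` ratio over word lists) is just one assumed way to define "close".

```python
import re
from difflib import SequenceMatcher
from itertools import combinations


def tokenize(phrase):
    """Lowercase the phrase and split it into words, dropping punctuation."""
    return re.findall(r"[a-z0-9']+", phrase.lower())


def word_combinations(words, min_size=2):
    """Yield every order-preserving combination of at least min_size words."""
    for size in range(min_size, len(words) + 1):
        yield from combinations(words, size)


def word_similarity(a, b):
    """Word-level similarity in [0, 1] (difflib ratio over the token lists)."""
    return SequenceMatcher(None, tokenize(a), tokenize(b)).ratio()


doc = "Mary had a little lamb whose fleece was white as snow"
query = "Jane had a little lamb whose fleece was white as snow"

words = tokenize(query)
n_clauses = sum(1 for _ in word_combinations(words))

print(len(words), "words ->", n_clauses, "OR clauses")     # 11 words -> 2036 OR clauses
print("similarity:", round(word_similarity(query, doc), 3))  # similarity: 0.909
```

The clause count roughly doubles with every extra word in the phrase, which is consistent with performance falling over so quickly, while the two phrases still share 10 of their 11 words in order.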
_______________________________________________
General mailing list
General@developer.marklogic.com
http://community.marklogic.com/mailman/listinfo/general