On Tue, 16 Dec 2014 00:13:29 -0800, <[email protected]> wrote:

> Forgot to send this to the list, sorry about that.
>
> In addition to what is below, one more question: can we do multilanguage
> searches? We have content in multiple languages, and the end user is
> searching with keywords or phrases, and we'd like to hit all content,
> even if we specify certain language for the query (for stemming
> purposes). Is this possible? Or do we need to do a query, containing all
> the terms / phrases in all the languages we support to get all the
> possible results?
>
> (The other question, pasted here from below was "I'll confirm once more
> that there is no way to override in configuration that we'd want the
> stemming to be case insensitive, is that right?")
>
> Ville
>
My recommendation is to use "unstemmed" for multi-lingual searches, as
it doesn't make much sense to have, say, French "chats" match English
"chatting". Since stemming is language-specific, you have to specify
which language to use for stemming.

Sometimes folks really want "no, really, stem it in every language I
know and still match if it is in some language I don't know". If that is
your use case, then you need to do some query expansion by or-ing
together the stemmed query for every language that you have advanced
stemming support for. I'd also add in the unstemmed case:

    cts:or-query((
      (: one query per language that has stemming support :)
      for $lang in ($every-language-I-have-licensed)
      return cts:word-query($word, "lang="||$lang),
      (: plus an exact, unstemmed match for everything else :)
      cts:word-query($word, "unstemmed")
    ))

//Mary
