On Tue, 16 Dec 2014 00:13:29 -0800, <[email protected]> wrote:

>   Forgot to send this to the list, sorry about that.
>
> In addition to what is below, one more question: can we do multilanguage  
> searches? We have content in multiple languages, and the end user is  
> searching with keywords or phrases, and we'd like to hit all content,  
> even if we specify certain language for the query (for stemming  
> purposes). Is this possible? Or do we need to do a query, containing all  
> the terms / phrases in all the languages we support to get all the  
> possible results?
>
> (The other question, pasted here from below was "I'll confirm once more  
> that there is no way to override in configuration that we'd want the  
> stemming to be case insensitive, is that right?")
>
> Ville
>

My recommendation is to use "unstemmed" for multilingual searches,
as it doesn't make much sense to have, say, French "chats"
match English "chatting".  Stemming is language-specific, so you
have to specify which language to stem in. Sometimes folks
really do want "stem it in every language I know, and still match
if it is in some language I don't know". If that is your use case,
you need to do some query expansion: OR together a word query
for every language that you have advanced stemming support for.
I'd also add in the unstemmed case:

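cts:or-query((
  (: one word query per language, so each one stems by that language's rules :)
  for $lang in ($every-language-I-have-licensed)
  return cts:word-query($word, "lang="||$lang),
  (: plus the literal word, for content in languages not listed above :)
  cts:word-query($word, "unstemmed")
))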
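
As a rough sketch of how that expansion might plug into a search:
the $word and $langs values below are placeholders I made up, not
anything from your setup, and searching fn:doc() (every document)
is just an assumption to keep the example short.

xquery version "1.0-ml";

(: placeholder inputs; substitute your own term and licensed languages :)
declare variable $word := "chats";
declare variable $langs := ("en", "fr");

cts:search(fn:doc(),
  cts:or-query((
    (: stem the term once per language :)
    for $lang in $langs
    return cts:word-query($word, "lang=" || $lang),
    (: and also match the term exactly as typed :)
    cts:word-query($word, "unstemmed")
  )))

Whether the per-language queries actually stem depends on the
stemmed-search settings of your database, so treat this as a
starting point rather than a finished query.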


//Mary
_______________________________________________
General mailing list
[email protected]
http://developer.marklogic.com/mailman/listinfo/general