luyi created SOLR-13106:
---------------------------

             Summary: Multiple mlt.fl does not work well if the termvectors is 
repeated
                 Key: SOLR-13106
                 URL: https://issues.apache.org/jira/browse/SOLR-13106
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
          Components: MoreLikeThis
    Affects Versions: 5.5.6
            Reporter: luyi


For example, my data is:

{ "id":"100079750",
"title":"I like cat, don't like dog",
"tags":["cat"],
"desc":["my cat photo"]
}

The title and desc fields use the IK tokenizer, and the tags field is of type text_ws.

When I use mlt.fl=title,tags,desc together with the debugQuery parameter, the result shows:

"interestingTerms":[ "desc:my",1.0, "desc:photo",1.0, "desc:don",1.0, "title:dog",1.0, "desc:cat",1.0, "title:like",1.0],
"debug":{
"rawquerystring":"id:61",
"querystring":"id:61",
"parsedquery":"desc:my desc:photo desc:don title:dog desc:cat title:like",
"parsedquery_toString":"desc:my desc:photo desc:don title:dog desc:cat title:like",
......

Look at the word cat. It appears in the tags, desc and title fields, but the result shows that cat is used only in the desc field and is ignored in the tags and title fields.
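
To make this easier to check, here is a small sketch that groups the reported interestingTerms by term, so you can see which fields each term was assigned to. The list is copied from the output above; the helper name fields_by_term is only for illustration and is not part of Solr.

```python
from collections import defaultdict

# interestingTerms as reported above: alternating "field:term" and score.
interesting_terms = [
    "desc:my", 1.0, "desc:photo", 1.0, "desc:don", 1.0,
    "title:dog", 1.0, "desc:cat", 1.0, "title:like", 1.0,
]


def fields_by_term(flat):
    """Map each term to the list of fields it was reported under."""
    grouped = defaultdict(list)
    # Even positions hold "field:term"; odd positions hold the score.
    for entry in flat[::2]:
        field, term = entry.split(":", 1)
        grouped[term].append(field)
    return dict(grouped)


grouped = fields_by_term(interesting_terms)
print(grouped["cat"])  # ['desc']
```

Every term in the reported output maps to exactly one field, even though cat is present in all three fields of the document.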

Finally, I found the reason: when a word is repeated in more than one field, it is used in only one of those fields to build the query.

In addition, sometimes a word appears only in the tags field, but when doing the MLT the word is shown under another field such as title or desc, even though it never appears in those fields!
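
One way this kind of symptom could arise is sketched below. This is an assumption for illustration only, not a confirmed description of the MoreLikeThis code: if term frequencies from all mlt.fl fields are merged into a single map keyed by the term text, and a single field is picked for each term afterwards, then a term shared by several fields can end up in only one of them. The token lists, the doc_freq numbers and both function names are hypothetical; the real IK and text_ws output may differ.

```python
from collections import Counter

# Hypothetical pre-tokenized field values for the example document.
doc = {
    "title": ["like", "cat", "don", "like", "dog"],
    "tags": ["cat"],
    "desc": ["my", "cat", "photo"],
}

# Hypothetical index-wide document frequencies per (field, term).
doc_freq = {("title", "cat"): 3, ("tags", "cat"): 2, ("desc", "cat"): 5}


def terms_keyed_by_term(doc, doc_freq):
    """Merge all fields into one term -> tf map, then pick ONE field per term."""
    tf = Counter()
    for tokens in doc.values():
        tf.update(tokens)
    result = {}
    for term, freq in tf.items():
        # The field choice ignores where the term occurred in this document.
        best = max(doc, key=lambda f: doc_freq.get((f, term), 0))
        result[term] = (best, freq)
    return result


def terms_keyed_by_field(doc):
    """Keep a separate term -> tf map per field, so no field is lost."""
    return {field: Counter(tokens) for field, tokens in doc.items()}


merged = terms_keyed_by_term(doc, doc_freq)
per_field = terms_keyed_by_field(doc)
print(merged["cat"])  # ('desc', 3)
print([f for f, tf in per_field.items() if "cat" in tf])  # ['title', 'tags', 'desc']
```

In this sketch the merged map reports cat only under desc, while the per-field map keeps it in title, tags and desc. Because the field is chosen without looking at where the term came from, the same rule could also place a term under a field it never occurred in.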

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

