Hey Maarten, I would use the "explain": true option to see exactly why some of your documents are being scored higher than others. As far as I know, MoreLikeThis uses the same full-text scoring as a regular query, so term position would affect the score.
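As a rough sketch of what such a request could look like, the snippet below builds a search body that combines "explain": true with a more_like_this query on the product_id field, using the like_text parameter mentioned in the question. The helper name build_mlt_explain_request is hypothetical, and setting min_term_freq and min_doc_freq to 1 is an assumption on my part: each product id presumably occurs only once per wishlist, so higher thresholds might filter every term out. The snippet only builds and prints the JSON; it does not talk to a cluster.

```python
import json


def build_mlt_explain_request(product_ids, field="product_id"):
    """Build a search body: a more_like_this query with explain enabled.

    Hypothetical helper for illustration only; the field name comes from
    the question, the threshold values are assumptions.
    """
    return {
        # Ask Elasticsearch to attach a score explanation to every hit.
        "explain": True,
        "query": {
            "more_like_this": {
                "fields": [field],
                # Feed the original wishlist's ids as one space-separated text.
                "like_text": " ".join(str(pid) for pid in product_ids),
                # Assumption: lowered so ids that occur once are still used.
                "min_term_freq": 1,
                "min_doc_freq": 1,
            }
        },
    }


body = build_mlt_explain_request([1234, 4444, 5555, 6666])
print(json.dumps(body, indent=2))
```

Sending this body to the index's _search endpoint should return an _explanation section with each hit, which breaks the score down into its components and should show where two seemingly identical wishlists differ.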
http://lucene.apache.org/core/3_0_3/api/contrib-queries/org/apache/lucene/search/similar/MoreLikeThis.html

Justin

On Wednesday, January 8, 2014 3:04:47 AM UTC-5, Maarten Roosendaal wrote:
>
> Hi,
>
> I have a question about why the 'more like this' algorithm scores
> documents higher than others, while they are (at first glance) the same.
>
> What i've done is index wishlist-documents which contain 1 property:
> product_id, this property contains an array of product_id's (e.g. [1234,
> 4444, 5555, 6666]. What i'm trying to do is find similair wishlist for a
> given wishlist with id x. The MLT API seems to work, it returns other
> documents which contain at least 1 of the product_id's from the original
> list.
>
> But what is see is that, for example. i get 10 hits, the first 6 hits
> contain the same (and only 1) product_id, this product_id is present in the
> original wishlist. What i would expect is that the score of the first 6 is
> the same. However what i see is that only the first 2 have the same, the
> next 2 a lower score and the next 2 even lower. Why is this?
>
> Also, i'm trying to write the MLT API as an MLT query, but somehow it
> doesn't work. I would expect that i need to take the entire content of the
> original product_id property and feed is as input for the 'like_text'. The
> documentation is not very clear and doesn't provide examples so i'm a
> little lost.
>
> Hope someone can give some pointers.
>
> Thanks,
> Maarten
>
