Well, I've tried much larger values than 8, and it still doesn't seem to
do the job.
For now, assume my users are searching for exact substrings of a real
title.
Tom
On 13/01/17 16:22, Walter Underwood wrote:
I use a boost of 8 for title with no boost on the content. Both Infoseek and
Inktomi settled on the 8X boost, getting there with completely different
methodologies.
You might not want the title to completely trump the content. That causes some
odd anomalies. If someone searches for “ice age 2”, do you really want every
title with “2” to come before “ice age two”? Or a search for “steve jobs” to
return every article with “job” or “jobs” in the title first?
Also, use “edismax”, not “dismax”. Dismax was obsolete in Solr 3.x, five years
ago.
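As a rough illustration of the suggestion above (edismax, a boost of 8 on
title, no boost on content), the request parameters could be put together
like this. It is only a sketch: the helper names, the base URL and the core
name are made up for the example; the field names and parameter values are
the ones from this thread.

```python
from urllib.parse import urlencode


def edismax_params(user_query, title_boost=8):
    """Build Solr request parameters for an edismax search.

    "title" and "content" are the field names from this thread; the
    function name and the default boost are illustrative assumptions.
    """
    return {
        "q": user_query,
        "defType": "edismax",
        # Title terms count title_boost times as much as content terms.
        "qf": f"title^{title_boost} content",
        "sort": "score desc",
        "wt": "json",
    }


def select_url(base_url, user_query):
    """Return a /select URL; base_url is a hypothetical core address."""
    return f"{base_url}/select?{urlencode(edismax_params(user_query))}"


if __name__ == "__main__":
    print(select_url("http://localhost:8983/solr/mycore", "connected vehicle"))
```

Keeping the boost as a parameter makes it easy to compare 8 against larger
values without touching the rest of the request.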
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)
On Jan 13, 2017, at 7:10 AM, Tom Chiverton <t...@extravision.com> wrote:
I have a few hundred documents with title and content fields.
I want a match in title to trump matches in content. If I search for "connected
vehicle", a news article that has the phrase in its content shouldn't be ranked
higher than the page that has it in the title.
I have tried dismax with qf=title^2, as well as several other variants with the
standard query parser (like q=title:"foo"^2 OR content:"foo"), but documents
without the search term in the title still come out before those with the term
in the title when ordered by score.
Is there something I am missing?
From the docs, something like q=title:"connected vehicle"^2 OR content:"connected
vehicle" should have worked. Even using ^100 didn't help.
I tried with the dismax parser using
"q": "Connected Vehicle",
"defType": "dismax",
"indent": "true",
"qf": "title^2000 content",
"pf": "pf=title^4000 content^2",
"sort": "score desc",
"wt": "json",
but that was no better. If I remove content from pf/qf, the documents seem to
rank correctly.
Example query and results (content omitted): http://pastebin.com/5EhrRJP8
Managed schema: http://pastebin.com/mdraWQWE
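To make the requirement above checkable, here is a small sketch of a test
that could be run over a result list. It assumes the results arrive in ranked
order as dicts with a single-valued "title" string, and that a "title match"
means the phrase is a case-insensitive substring of the title; the function
name is invented for the example.

```python
def title_matches_first(docs, phrase):
    """Return True if every doc with `phrase` in its title is ranked
    ahead of every doc without it.

    `docs` is a list of dicts in ranked order (best first). A missing
    "title" key is treated as an empty title (an assumption).
    """
    phrase = phrase.lower()
    seen_non_match = False
    for doc in docs:
        in_title = phrase in doc.get("title", "").lower()
        if in_title and seen_non_match:
            # A title match was ranked below a non-matching document.
            return False
        if not in_title:
            seen_non_match = True
    return True


if __name__ == "__main__":
    ranked = [
        {"title": "Connected Vehicle"},
        {"title": "Company news"},
    ]
    print(title_matches_first(ranked, "connected vehicle"))
```

Running this against the "docs" array of each response gives a quick
pass/fail signal while trying different boosts.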
--
Tom Chiverton
Lead Developer
e: t...@extravision.com
p: 0161 817 2922
t: @extravision <http://www.twitter.com/extravision>
w: www.extravision.com