Hi, an update: I found another alternative while reading the very useful
blog post written by Doug.

In short, I should split the field "title" into two fields: title_notf and
title_phrase.
title_notf is indexed without term frequencies (omitTermFreqAndPositions=true)
and is used for term-matching queries; title_phrase keeps term frequencies
and positions and is used only for phrase queries.
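
A minimal sketch of what I have in mind, not tested yet. The field type
name text_general and the indexed/stored settings are my assumptions and
must be adapted to the real schema:

   <!-- title_notf: frequencies and positions are omitted, so every match
        counts as frequency 1; phrase queries cannot run on this field -->
   <field name="title_notf" type="text_general" indexed="true" stored="false"
          omitTermFreqAndPositions="true"/>
   <!-- title_phrase: keeps frequencies and positions, used only for phrases -->
   <field name="title_phrase" type="text_general" indexed="true" stored="false"/>

   <copyField source="title" dest="title_notf"/>
   <copyField source="title" dest="title_phrase"/>

In the request handler, qf would then point at title_notf and pf at
title_phrase:

   <str name="qf">combiField^3 title_notf</str>
   <str name="pf">combiField^3 title_phrase</str>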

However, I'm not sure what the drawbacks of this solution are. What do you
think?

On Thu, Nov 29, 2018 at 5:22 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:

> Yep, that makes sense.
> And given that an omitTermFreq parameter does not exist, if I use
> omitTermFreqAndPositions then phrase queries won't work.
> So it seems there is no alternative: the only way is to write my own
> similarity class.
>
> On Thu, Nov 29, 2018 at 4:15 PM Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
>> You are trying to use this in the field list, but the documentation
>> only talks about this being valid in the query clause itself. Which
>> perhaps makes it a bit less useful for your case, but does not look
>> like a bug.
>>
>> Regards,
>>    Alex.
>> On Thu, 29 Nov 2018 at 10:06, Vincenzo D'Amore <v.dam...@gmail.com>
>> wrote:
>> >
>> > Hi, thanks for your prompt reply :)
>> >
>> > I thought the constant score would be the easiest way, but unexpectedly,
>> > when I tried to specify a constant score in the qf parameter, an
>> > exception was raised.
>> > This is how I configured the constant score in solrconfig.xml:
>> >
>> >   <!-- ultra lightweight, for the autofilter evaluations -->
>> >    <requestHandler name="probe" class="solr.SearchHandler">
>> >       <lst name="defaults">
>> >          <str name="df">combiField</str>
>> >          <str name="defType">edismax</str>
>> >          <str name="echoParams">none</str>
>> >          <float name="tie">1</float>
>> >          <int name="rows">0</int>
>> >          <str name="qf">combiField^3 title^=1 </str>
>> >          <str name="pf">combiField^3 title</str>
>> >          <str name="mm"><![CDATA[100%]]></str>
>> >          <int name="qs">2</int>
>> >          <int name="ps">5</int>
>> >          <str name="q.alt">*:*</str>
>> >       </lst>
>> >    </requestHandler>
>> >
>> > This is the exception:
>> >
>> > <?xml version="1.0" encoding="UTF-8"?>
>> > <response>
>> >
>> > <lst name="responseHeader">
>> >   <bool name="zkConnected">true</bool>
>> >   <int name="status">500</int>
>> >   <int name="QTime">5</int>
>> > </lst>
>> > <lst name="error">
>> >   <str name="msg">For input string: "=1"</str>
>> >   <str name="trace">java.lang.NumberFormatException: For input string: "=1"
>> > at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:2043)
>> > at sun.misc.FloatingDecimal.parseFloat(FloatingDecimal.java:122)
>> > at java.lang.Float.parseFloat(Float.java:451)
>> > at java.lang.Float.valueOf(Float.java:416)
>> > at org.apache.solr.util.SolrPluginUtils.parseFieldBoosts(SolrPluginUtils.java:540)
>> > at org.apache.solr.search.DisMaxQParser.parseQueryFields(DisMaxQParser.java:71)
>> > at org.apache.solr.search.ExtendedDismaxQParser$ExtendedDismaxConfiguration.&lt;init&gt;(ExtendedDismaxQParser.java:1608)
>> > at org.apache.solr.search.ExtendedDismaxQParser.createConfiguration(ExtendedDismaxQParser.java:256)
>> > at org.apache.solr.search.ExtendedDismaxQParser.&lt;init&gt;(ExtendedDismaxQParser.java:115)
>> > at org.apache.solr.search.ExtendedDismaxQParserPlugin.createParser(ExtendedDismaxQParserPlugin.java:31)
>> > at it.apache.solr.search.SynonymsEdismaxQParserPlugin.createParser(SynonymsEdismaxQParserPlugin.java:102)
>> > at org.apache.solr.search.QParser.getParser(QParser.java:363)
>> > at org.apache.solr.search.QParser.getParser(QParser.java:315)
>> > at org.apache.solr.handler.component.QueryComponent.prepare(QueryComponent.java:159)
>> > at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:269)
>> > at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:195)
>> > at org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)
>> > at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:711)
>> > at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:517)
>> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:384)
>> > at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:330)
>> > at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1629)
>> > at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:533)
>> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
>> > at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
>> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>> > at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:190)
>> > at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1595)
>> > at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:188)
>> > at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1253)
>> > at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:168)
>> > at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:473)
>> > at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1564)
>> > at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:166)
>> > at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1155)
>> > at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
>> > at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:219)
>> > at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
>> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>> > at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
>> > at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
>> > at org.eclipse.jetty.server.Server.handle(Server.java:530)
>> > at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:347)
>> > at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:256)
>> > at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:279)
>> > at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:102)
>> > at org.eclipse.jetty.io.ChannelEndPoint$2.run(ChannelEndPoint.java:124)
>> > at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:247)
>> > at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.produce(EatWhatYouKill.java:140)
>> > at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
>> > at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:382)
>> > at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:708)
>> > at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:626)
>> > at java.lang.Thread.run(Thread.java:748)
>> > </str>
>> >   <int name="code">500</int>
>> > </lst>
>> > </response>
>> >
>> > What do you think, is this a bug? Should I submit an issue?
>> >
>> > On Thu, Nov 29, 2018 at 2:03 PM Doug Turnbull <
>> > dturnb...@opensourceconnections.com> wrote:
>> >
>> > > I think the similarity way (setting k1 to 0) or a constant score query
>> > > are probably the best ways. Omitting term freqs and position will also
>> > > remove positions meaning phrase queries won’t work.
>> > >
>> > > This blog article might be useful for your use case. I discuss a
>> > > similar prob.
>> > >
>> > > https://opensourceconnections.com/blog/2014/12/08/title-search-when-relevancy-is-only-skin-deep/
>> > >
>> > > Doug
>> > > On Thu, Nov 29, 2018 at 7:59 AM Alexandre Rafalovitch <
>> arafa...@gmail.com>
>> > > wrote:
>> > >
>> > > > Perhaps constant score would be useful here:
>> > > >
>> > > > http://lucene.apache.org/solr/guide/7_5/the-standard-query-parser.html#constant-score-with
>> > > >
>> > > > Also, all the options like omitTermFreqAndPositions are described
>> > > > here:
>> > > >
>> > > > http://lucene.apache.org/solr/guide/7_5/field-type-definitions-and-properties.html#field-default-properties
>> > > >
>> > > > Regards,
>> > > >    Alex.
>> > > > On Thu, 29 Nov 2018 at 05:43, Vincenzo D'Amore <v.dam...@gmail.com>
>> > > wrote:
>> > > > >
>> > > > > Hi all,
>> > > > >
>> > > > > I have a relevancy problem. I think I know a solution for it, but I
>> > > > > would like to know whether, in your experience, there is a better
>> > > > > one.
>> > > > >
>> > > > > For example, I have two documents that both have "termA" in their
>> > > > > "title" field. The former has "termA" repeated several times, while
>> > > > > the latter has the term only once. When searching for "termA", the
>> > > > > former gets a higher score due to TF/IDF.
>> > > > >
>> > > > > Both documents are fairly similar, so I don't want term frequency
>> > > > > in the title to boost the score.
>> > > > > The only solution I know to flatten the score when there is a
>> > > > > difference in term frequency is to configure my own similarity
>> > > > > class in the schema that always returns 1 for term frequency.
>> > > > >
>> > > > > I'm curious to know if you know another way. In the beginning I
>> > > > > thought of omitting term frequency at index time.
>> > > > >
>> > > > > Looking around, I found an old issue,
>> > > > > https://issues.apache.org/jira/browse/LUCENE-1561, where omitTF was
>> > > > > renamed to omitTermFreqAndPositions.
>> > > > >
>> > > > > What I've understood is that omitting term frequency also implies
>> > > > > removing term positions, so very likely omitting term frequency is
>> > > > > not what I'm looking for.
>> > > > >
>> > > > > As said, I'm curious to know if you know another way, and as usual
>> > > > > thanks in advance for your time and for your patience.
>> > > > >
>> > > > > Best regards,
>> > > > > Vincenzo
>> > > > >
>> > > > >
>> > > > > --
>> > > > > Vincenzo D'Amore
>> > > >
>> > > --
>> > > CTO, OpenSource Connections
>> > > Author, Relevant Search
>> > > http://o19s.com/doug
>> > >
>> >
>> >
>> > --
>> > Vincenzo D'Amore
>>
>
>
> --
> Vincenzo D'Amore
>
>

-- 
Vincenzo D'Amore
