RE: Can I simplify this bit of query boosting?

Trevor Nicholls Sun, 14 May 2023 22:24:46 -0700

Thanks for this tip, it looks like it might do the biz (although I'm finding 
choosing good values for the constants a bit of a marathon exercise).


cheers
T

-----Original Message-----
From: Michael Sokolov <[email protected]> 
Sent: Friday, 12 May 2023 08:01
To: [email protected]
Subject: Re: Can I simplify this bit of query boosting?

You might also want to have a look at FeatureField. This can be used to 
associate a score with a particular term.
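
To get a feel for how a FeatureField query turns a stored value into a score contribution, here is a minimal plain-Java sketch of the three scoring shapes described in the FeatureField javadoc (saturation, log and sigmoid). It does not call Lucene at all: the class and method names, the weights, the pivots and the version numbers are made up for illustration, and Lucene stores feature values at reduced precision, so real scores will differ slightly.

```java
// Plain-Java sketch of the scoring shapes documented for Lucene's FeatureField.
// Nothing here calls Lucene; all names and constants are illustrative assumptions.
final class FeatureScoreSketch {

    // Saturation: weight * S / (S + pivot). Approaches `weight` as S grows;
    // a value equal to the pivot earns exactly half the weight.
    static double saturation(double weight, double pivot, double value) {
        return weight * value / (value + pivot);
    }

    // Log: weight * ln(scalingFactor + S). Unbounded, but grows slowly.
    static double log(double weight, double scalingFactor, double value) {
        return weight * Math.log(scalingFactor + value);
    }

    // Sigmoid: weight * S^exp / (S^exp + pivot^exp). A larger `exp` gives a sharper step.
    static double sigmoid(double weight, double pivot, double exp, double value) {
        double s = Math.pow(value, exp);
        return weight * s / (s + Math.pow(pivot, exp));
    }

    public static void main(String[] args) {
        double[] versions = {1.0, 4.0, 8.0, 9.5}; // hypothetical version numbers
        for (double v : versions) {
            System.out.printf("version %.1f: saturation=%.3f log=%.3f sigmoid=%.3f%n",
                v, saturation(10.0, 8.0, v), log(10.0, 1.0, v), sigmoid(10.0, 8.0, 4.0, v));
        }
    }
}
```

With the saturation and sigmoid shapes, the pivot is the value that earns half the weight, which may make the constants easier to reason about: pick the version that should count as "typical" and use that as the pivot.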

On Thu, May 11, 2023 at 1:13 PM Hrvoje Lončar <[email protected]> wrote:
>
> I had a situation where I wanted to sort a list of articles based on
> the amount of data entered. For example, an article with a photo,
> description and ingredients should rank higher than one with
> only a name and photo.
> For that purpose I created a numeric field, named completeness, that
> holds a calculated value. Later, when executing a query, this number is
> used as a sort modifier - in my case in reverse order.
> My project is based on Hibernate Search, so I guess a code snippet
> from it would not be much use here. This numeric value does not have
> to be the first sort modifier. You can put the main sort rule first and
> then refine the sort with this numeric value.
> I hope it helps - at least to give you an idea of which way to go.
> BR,
> Hrvoje
>
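
The completeness idea above can be sketched in plain Java, without Hibernate Search or Lucene. Everything here is an assumption for illustration: the Article class, its fields, the way completeness is counted (one point per non-blank field) and the use of a relevance value as the main sort rule.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a "completeness" value used as a secondary, descending sort key.
final class CompletenessSortSketch {

    static final class Article {
        final String name;
        final String photo;
        final String description;
        final String ingredients;
        final double relevance; // hypothetical main sort key

        Article(String name, String photo, String description,
                String ingredients, double relevance) {
            this.name = name;
            this.photo = photo;
            this.description = description;
            this.ingredients = ingredients;
            this.relevance = relevance;
        }

        // One point per field that is actually filled in.
        int completeness() {
            int filled = 0;
            for (String field : new String[] {name, photo, description, ingredients}) {
                if (field != null && !field.trim().isEmpty()) {
                    filled++;
                }
            }
            return filled;
        }
    }

    // Main rule first (relevance, descending), then completeness (descending) to refine it.
    static List<Article> sort(List<Article> articles) {
        List<Article> sorted = new ArrayList<>(articles);
        Comparator<Article> byRelevanceDesc =
            Comparator.comparingDouble((Article a) -> a.relevance).reversed();
        Comparator<Article> byCompletenessDesc =
            Comparator.comparingInt((Article a) -> a.completeness()).reversed();
        sorted.sort(byRelevanceDesc.thenComparing(byCompletenessDesc));
        return sorted;
    }

    public static void main(String[] args) {
        List<Article> articles = new ArrayList<>();
        articles.add(new Article("plain", "p.jpg", null, null, 1.0));
        articles.add(new Article("full", "f.jpg", "a description", "some ingredients", 1.0));
        for (Article a : sort(articles)) {
            System.out.println(a.name + " completeness=" + a.completeness());
        }
    }
}
```

In a real index the completeness value would be computed at indexing time and stored in its own numeric field, as described above, so that it is available as a sort key at query time.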
> On Thu, 11 May 2023, 15:44 Trevor Nicholls, 
> <[email protected]>
> wrote:
>
> > Hi, I've hit a wall here.
> >
> >
> >
> > In brief, users search a library of documents. Every indexed 
> > document has a version number field which is always populated for 
> > release notes, sometimes for other docs. Every document also has a 
> > category field which is how release notes are identified, among other 
> > content types.
> >
> >
> >
> > The requirement is to make sure that release notes are boosted 
> > relative to other content, and that release notes with higher 
> > versions are boosted more than those with lower versions.
> >
> >
> >
> > I've currently implemented a crude method to achieve this, and the 
> > crucial part of the process is here:
> >
> >
> >
> >   // have IndexReader reader, IndexSearcher searcher, Analyzer analyzer, String userQuery
> >   QueryParser parser = new QueryParser( "content", analyzer );
> >   parser.setDefaultOperator( QueryParserBase.AND_OPERATOR );
> >   BooleanQuery query = new BooleanQuery.Builder()
> >      .add( parser.parse( userQuery ), Occur.MUST )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:9*" ), 90.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:8*" ), 80.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:7*" ), 70.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:6*" ), 60.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:5*" ), 50.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:4*" ), 40.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:3*" ), 30.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:2*" ), 20.0f ), Occur.SHOULD )
> >      .add( new BoostQuery( parser.parse( "category:relnotes version:1*" ), 10.0f ), Occur.SHOULD )
> >      .build();
> >
> >
> >
> > I found through experimentation that the boost factors are not 
> > multiplicative (as most of the explanations on the web implied) but 
> > are simply added to the score. If I've misunderstood how boosting 
> > works, please enlighten me!
> >
> > The versions and boost factors above are arbitrary just to keep the 
> > example simple; in reality the versions cover a much wider range and 
> > the boost values do too.
> >
> >
> >
> > This is working to a degree. But it's not granular enough, I really 
> > want the boost factor to be calculated directly from the version 
> > value, if that is possible.
> >
> > I also imagine doing it this way makes searches quite expensive.
> >
> >
> >
> > How could I improve this?
> >
> >
> >
> > cheers
> >
> > T
> >
> >
> >
> >
> >
> >
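
On the observation in the original message that the boosts look additive rather than multiplicative: BooleanQuery sums the scores of its matching MUST and SHOULD clauses, and BoostQuery multiplies the score of the query it wraps. So each boost is multiplicative on its own SHOULD clause, and that product is then added to the score of the user query. A small arithmetic sketch, with made-up clause scores and hypothetical names (it does not call Lucene):

```java
// Arithmetic sketch of how a boosted SHOULD clause contributes to a BooleanQuery score.
// All scores below are invented for illustration; no Lucene classes are used.
final class BoostArithmeticSketch {

    // BoostQuery: the wrapped query's score is multiplied by the boost.
    static double boosted(double rawClauseScore, double boost) {
        return rawClauseScore * boost;
    }

    // BooleanQuery: the document score is the sum of the scores of the clauses that matched.
    static double combined(double mustScore, double... matchingShouldScores) {
        double total = mustScore;
        for (double s : matchingShouldScores) {
            total += s;
        }
        return total;
    }

    public static void main(String[] args) {
        double userQueryScore = 2.5;    // hypothetical score of the MUST clause
        double relnotesRawScore = 1.2;  // hypothetical raw score of "category:relnotes version:9*"

        double releaseNote = combined(userQueryScore, boosted(relnotesRawScore, 90.0));
        double otherDoc = combined(userQueryScore); // no SHOULD clause matched

        System.out.printf("release note: %.1f, other document: %.1f%n", releaseNote, otherDoc);
    }
}
```

If the raw score of the boosted clause is roughly the same for every document it matches, as is likely for a keyword-style category field combined with a prefix query, the net effect looks like a fixed amount added to the user query's score, which would match what the experiment showed.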

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

