I had a situation where I wanted to sort a list of articles based on the amount of data entered. For example, an article that has a photo, a description, and ingredients should rank higher than one that has only a name and a photo.

For that purpose I created a numeric field named completeness that holds a calculated value. Later, when executing a query, this number is used as a sort modifier, in my case in reverse order. My project is based on Hibernate Search, so I guess I can't really post a code snippet that would apply here.

This numeric value does not have to be the first sort modifier. First you put the main sort rule, and then you can refine the sort with this numeric value.

I hope it helps, at least to give you an idea of which way to go.

BR,
Hrvoje
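
A minimal sketch of the idea above in plain Java, not the Hibernate Search code from the project. The `Article` class, its field names, the one-point-per-populated-field scoring, and the `relevance` value standing in for the main sort rule are all assumptions made for illustration. It only shows the ordering logic: main rule first, then completeness in reverse order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CompletenessSort {

    /** Hypothetical article model; the field names are illustrative only. */
    static final class Article {
        final String name;
        final String photo;        // null when missing
        final String description;  // null when missing
        final String ingredients;  // null when missing
        final double relevance;    // stands in for the main sort rule

        Article(String name, String photo, String description,
                String ingredients, double relevance) {
            this.name = name;
            this.photo = photo;
            this.description = description;
            this.ingredients = ingredients;
            this.relevance = relevance;
        }

        /** One point per populated field (an assumed scoring rule). */
        int completeness() {
            int score = 0;
            if (isPresent(name)) score++;
            if (isPresent(photo)) score++;
            if (isPresent(description)) score++;
            if (isPresent(ingredients)) score++;
            return score;
        }
    }

    static boolean isPresent(String value) {
        return value != null && !value.trim().isEmpty();
    }

    /** Main rule first (relevance, highest first), then completeness, highest first. */
    static List<Article> sortArticles(List<Article> articles) {
        Comparator<Article> byRelevanceDesc =
                Comparator.comparingDouble((Article a) -> a.relevance).reversed();
        Comparator<Article> byCompletenessDesc =
                Comparator.comparingInt((Article a) -> a.completeness()).reversed();

        List<Article> sorted = new ArrayList<>(articles);
        sorted.sort(byRelevanceDesc.thenComparing(byCompletenessDesc));
        return sorted;
    }

    public static void main(String[] args) {
        List<Article> articles = Arrays.asList(
                new Article("Soup", "soup.jpg", null, null, 1.0),
                new Article("Stew", "stew.jpg", "Slow-cooked stew", "beef, carrots", 1.0),
                new Article("Bread", "bread.jpg", "Crusty loaf", null, 2.0));

        for (Article a : sortArticles(articles)) {
            System.out.println(a.name + " relevance=" + a.relevance
                    + " completeness=" + a.completeness());
        }
    }
}
```

In Lucene itself the same ordering would presumably be expressed as a `Sort` whose first `SortField` is the main rule (for example `SortField.FIELD_SCORE`) and whose second is a reversed numeric `SortField` on the completeness field, which needs to be indexed as doc values for sorting.
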
On Thu, 11 May 2023, 15:44 Trevor Nicholls, <tre...@castingthevoid.com> wrote:

> Hi, I've hit a wall here.
>
> In brief, users search a library of documents. Every indexed document has a
> version number field which is always populated for release notes, sometimes
> for other docs. Every document also has a category field which is how
> release notes are identified, among other content types.
>
> The requirement is to make sure that release notes are boosted relative to
> other content, and that release notes with higher versions are boosted more
> than those with lower versions.
>
> I've currently implemented a crude method to achieve this, and the crucial
> part of the process is here:
>
> // have IndexReader reader, IndexSearcher searcher, Analyzer analyzer, String userQuery
> QueryParser parser = new QueryParser( "content", analyzer );
> parser.setDefaultOperator( QueryParserBase.AND_OPERATOR );
> BooleanQuery query = new BooleanQuery.Builder()
>     .add( parser.parse( userQuery ), Occur.MUST )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:9*" ), 90.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:8*" ), 80.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:7*" ), 70.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:6*" ), 60.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:5*" ), 50.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:4*" ), 40.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:3*" ), 30.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:2*" ), 20.0f ), Occur.SHOULD )
>     .add( new BoostQuery( parser.parse( "category:relnotes version:1*" ), 10.0f ), Occur.SHOULD )
>     .build();
>
> I found through experimentation that the boost factors are not
> multiplicative (as most of the explanations on the web implied) but are
> simply added to the score. If I've misunderstood how boosting works, please
> enlighten me!
>
> The versions and boost factors above are arbitrary just to keep the example
> simple; in reality the versions cover a much wider range and the boost
> values do too.
>
> This is working to a degree. But it's not granular enough, I really want the
> boost factor to be calculated directly from the version value, if that is
> possible.
>
> I also imagine doing it this way makes searches quite expensive.
>
> How could I improve this?
>
> cheers
> T