Hi Ashwin,

What happens if you reduce the number of rows requested? Do you really need 500 results each time? I think this will ask for 500 results from *each shard* too. https://solr.apache.org/guide/8_7/pagination-of-results.html
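
To put a rough number on that, here is a back-of-envelope sketch. It assumes the usual distributed-search behaviour, where each shard returns its own top (start + rows) candidates and the coordinating node merges them. The function name is made up for illustration.

```python
def candidates_merged(num_shards: int, rows: int, start: int = 0) -> int:
    """Upper bound on the (id, score) entries the coordinating node merges.

    Assumption: every shard is asked for its own top (start + rows) documents.
    """
    per_shard = start + rows
    return num_shards * per_shard


# The setup from the original mail: 8 shards, rows=500.
print(candidates_merged(num_shards=8, rows=500))
# The same setup with a smaller page, for comparison.
print(candidates_merged(num_shards=8, rows=50))
```

This only counts the candidates shipped to the coordinator; it says nothing about the per-shard scoring cost, which depends on how many documents match.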

Also it looks like you mean boost=def(boostFieldA,1) not boost:def(boostFieldA,1), am I right?
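
If it helps, this is roughly how I would expect the parameters to look when built programmatically. It is only a sketch: defType=edismax is my assumption (your mail does not say which parser you use), and rows=50 is just an illustrative smaller page size, not a recommendation.

```python
from urllib.parse import urlencode

# Parameters copied from the example query, using "=" style key/value pairs.
params = {
    "defType": "edismax",  # assumption: parser not stated in the original mail
    "q": "hello world",
    "qf": "title description keywords",
    "pf": "title^0.5",
    "ps": "0",
    "fq": "type:P",
    "boost": "def(boostFieldA,1)",  # multiplicative boost, written boost=...
    "bf": "mul(termfreq(termScoreFieldB,$q),1000.0)",  # additive boost function
    "rows": "50",  # illustrative value, smaller than the original 500
    "fl": "id,score",
}

# URL-encode the parameters so they can be appended to a /select request.
query_string = urlencode(params)
print(query_string)
```

Building the string this way also takes care of escaping the parentheses, commas and the `$q` reference.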

Cheers

Charlie

On 19/01/2022 02:43, Ashwin Ramesh wrote:
Gentle ping! Promise it's my final one! :)

On Thu, Jan 13, 2022 at 8:01 AM Ashwin Ramesh wrote:

Hi everyone,

I have a few questions about how we can improve our Solr query
performance, especially for boosts (bf, bq, boost, etc.).

*System Specs:*
Solr Version: 7.7.x
Heap Size: 31gb
Num Docs: >100M
Shards: 8
Replication Factor: 6
Index is completely mapped into memory


Example query:
{
q=hello world
qf=title description keywords
pf=title^0.5
ps=0
fq=type:P
boost:def(boostFieldA,1) // boostFieldA is docValue float type
bf=mul(termfreq(termScoreFieldB,$q),1000.0) // termScoreFieldB is a textField. No docValue, just indexed
rows:500
fl=id,score
}

numFound: >21M
qTime: 800ms

Experimentation of params:

    - When I remove the boost parameter, the qTime drops to 525ms
    - When I remove the bf parameter, the qTime drops to 650ms
    - When I remove both the boost & bf parameters, the qTime drops to
    400ms


Questions:

    1. Is there any way to improve the performance of the boosts (specific
    field types, etc)?
    2. Will sharding further such that each core only has to score a
    smaller subset of documents help with query performance?
    3. Is there any performance impact when boosting/querying against
    sparse fields, both indexed=true or docValues=true?
    4. It seems the base case scoring is 400ms, which is already quite
    high. Is this because the query (hello world) implicitly gets parsed as
    (hello OR world)? Thus it would be more computationally expensive?
    5. Any other advice :) ?


Thanks in advance,

Ash







--
Charlie Hull - Managing Consultant at OpenSource Connections Limited
Founding member of The Search Network <http://www.thesearchnetwork.com> and co-author of Searching the Enterprise <https://opensourceconnections.com/wp-content/uploads/2020/08/ES_book_final_journal_version.pdf>
