dsmiley commented on PR #3418: URL: https://github.com/apache/solr/pull/3418#issuecomment-3300226161
> IMO, if the data is independently sharded and shard-level scoring doesn't matter in overall query relevance, it may not make sense to first combine shard results based on their original scores and then apply RRF per query (as done in Way 2). In such cases, users may prefer Way 1.

I disagree. Per-shard RRF on its own quickly devolves into shard interleaving as the shard count increases, since at the coordination/aggregation step there's no global ranking measure left anymore. The results from each shard are ultimately treated as equivalent across the shards. Consequently, if, say, the _real_ RRF best result pointed clearly to one document (best score in both sub-queries, let's say), then with per-shard RRF it'd merely land _arbitrarily_ somewhere in the top 20 if there are, say, 20 shards. Nobody would want that.
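To make the interleaving point concrete, here's a toy sketch in plain Python (not Solr code, and not what the PR implements). Everything in it is an assumption for illustration: the shard names, doc ids, scores, the two sub-queries labeled `lexical` and `vector`, the constant `k = 60`, and the premise that original scores are comparable across shards. It computes RRF per shard and merges the fused scores, then computes RRF over globally merged per-query rankings.

```python
from collections import defaultdict

K = 60  # commonly used RRF constant; an assumption here
QUERIES = ("lexical", "vector")  # two hypothetical sub-queries

def rrf(rankings, k=K):
    """Fuse ranked lists of doc ids into {doc_id: sum of 1/(k + rank)}."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return dict(fused)

def ranked(scores):
    """Doc ids by descending score; ties broken by id for determinism."""
    return [d for d, _ in sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))]

# Made-up data: shards[shard][query] = {doc_id: original score}.
# s1_a is clearly the best document for both sub-queries.
shards = {
    "s1": {"lexical": {"s1_a": 9.0, "s1_b": 2.0}, "vector": {"s1_a": 0.95, "s1_b": 0.20}},
    "s2": {"lexical": {"s2_a": 3.0, "s2_b": 1.0}, "vector": {"s2_a": 0.40, "s2_b": 0.10}},
    "s3": {"lexical": {"s3_a": 2.5, "s3_b": 1.5}, "vector": {"s3_a": 0.30, "s3_b": 0.15}},
}

def per_shard_rrf(shards):
    """RRF inside each shard; the coordinator only sees fused scores."""
    merged = {}
    for per_query in shards.values():
        merged.update(rrf([ranked(per_query[q]) for q in QUERIES]))
    return merged

def global_rrf(shards):
    """Merge each sub-query across shards by original score, then apply RRF."""
    rankings = []
    for q in QUERIES:
        all_scores = {}
        for per_query in shards.values():
            all_scores.update(per_query[q])
        rankings.append(ranked(all_scores))
    return rrf(rankings)

local = per_shard_rrf(shards)
glob = global_rrf(shards)
print("per-shard:", sorted(local.items(), key=lambda kv: (-kv[1], kv[0]))[:3])
print("global:   ", sorted(glob.items(), key=lambda kv: (-kv[1], kv[0]))[:3])
```

With this data, every shard's local winner ends up with the same fused score under per-shard RRF, so the coordinator has nothing to tell them apart by. The global variant keeps `s1_a` strictly on top.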