Re: different score from different replica of same shard

2021-01-13 Thread Walter Underwood
Yes, check performance before turning on the stats cache in prod. When we tested the LRUStatsCache in 6.6.2, searches were 11X slower. It should be possible to do distributed IDF with little extra overhead. Infoseek was doing that in 1995 and the patent on the technique has expired. wunder
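
To illustrate what "distributed IDF" means here, the following is a minimal sketch, not Solr's stats cache implementation and not the Infoseek technique mentioned above. It assumes a BM25-style IDF formula and hypothetical per-shard counts; the names `ShardTermStats`, `bm25_idf` and `global_idf` are invented for the example. The idea is that each shard reports its document count and the term's document frequency, and every shard then scores with the summed values.

```python
import math
from dataclasses import dataclass


@dataclass
class ShardTermStats:
    """Per-shard statistics for one term (hypothetical example values)."""
    doc_count: int  # documents on the shard
    doc_freq: int   # documents on the shard containing the term


def bm25_idf(doc_freq: int, doc_count: int) -> float:
    # BM25-style IDF; assumed here for illustration only.
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def global_idf(shards: list[ShardTermStats]) -> float:
    # Sum the raw counts first, then compute IDF once from the totals.
    total_docs = sum(s.doc_count for s in shards)
    total_df = sum(s.doc_freq for s in shards)
    return bm25_idf(total_df, total_docs)


shards = [ShardTermStats(doc_count=1000, doc_freq=50),
          ShardTermStats(doc_count=2000, doc_freq=10)]

# Local IDF differs per shard; the global value is shared by all shards.
local_idfs = [bm25_idf(s.doc_freq, s.doc_count) for s in shards]
shared_idf = global_idf(shards)
print(local_idfs, shared_idf)
```

Only two integers per term and shard have to be exchanged, which is why the extra cost can in principle be small; the actual overhead depends on how and when those counts are fetched.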

Re: different score from different replica of same shard

2021-01-13 Thread Vincent Brehin
Hello Bernd and Markus, A very instructive article, by the creator of TLOG mode (introduced in 7.0, btw): https://medium.com/@caomanhdat317/indexing-flow-of-solrcloud-sharding-distributed-systems-1-bba411bf8994 It helped me when architecting our replication policy. Not an easy matter, it's a

Re: different score from different replica of same shard

2021-01-13 Thread Markus Jelsma
Hello Bernd, I see the different replica types in the 7.1 [1] manual but not in the 6.6. ExactStatsCache should work in 6.6, just add it to solrconfig.xml, not the request handler [1]. It will slow down searches due to added overhead. Regards, Markus [1]

Re: different score from different replica of same shard

2021-01-13 Thread Bernd Fehling
Hello Markus, thanks a lot. Is TLOG also for SOLR 6.6.6 or only 8.x and up? I will first try ExactStatsCache. Should be added as invariant to request handler, right? Comparing the replica index directories, they have different sizes, and the index version and generation are different. Also Max

Re: different score from different replica of same shard

2021-01-13 Thread Markus Jelsma
Hello Bernd, This is normal for NRT replicas, because the way segments are merged and deletes are removed is not synchronized between replicas. In that case counts for TF and IDF and norms become slightly different. You can either use ExactStatsCache that fetches counts for terms before scoring,
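
The effect described above can be sketched with a few lines of arithmetic. This is an illustration under stated assumptions, not Solr code: it assumes a BM25-style IDF, and the counts for the two replicas are hypothetical. One replica has already merged its deleted documents away, while the other still counts them in its statistics.

```python
import math


def idf(doc_freq: int, doc_count: int) -> float:
    # BM25-style IDF; assumed here for illustration only.
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


# Hypothetical statistics for the same term on two replicas of one shard.
# Replica A has merged its segments, so deleted documents are gone.
replica_a = {"doc_count": 1000, "doc_freq": 50}
# Replica B has not merged yet, so deleted documents still inflate the counts.
replica_b = {"doc_count": 1100, "doc_freq": 58}

idf_a = idf(**replica_a)
idf_b = idf(**replica_b)

# Same live documents, slightly different IDF, hence slightly different scores.
print(f"replica A idf={idf_a:.4f}  replica B idf={idf_b:.4f}")
```

The difference is small, but it is enough to change scores and, for documents with nearly equal scores, their order between requests served by different replicas.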

different score from different replica of same shard

2021-01-13 Thread Bernd Fehling
Hello list, a question for better understanding scoring of a shard in a cloud. I see different scores from different replicas of the same shard. Is this normal and if yes, why? My understanding until now was that replicas are always the same within a shard and the same query to each replica