Hello,
In the mergeIds function of QueryComponent, a ShardFieldSortedHitQueue heap
is used to order the ShardDocs. However, look at the *lessThan* function:
protected boolean lessThan(ShardDoc docA, ShardDoc docB) {
  // If these docs are from the same shard, then the relative order
  // is how they appeared in the response from that shard.
  if (Objects.equals(docA.shard, docB.shard)) {
    // if docA has a smaller position, it should be "larger" so it
    // comes before docB.
    // This will handle sorting by docid within the same shard
    // comment this out to test comparators.
    return !(docA.orderInShard < docB.orderInShard);
  }
  // run comparators
  final int n = comparators.length;
  int c = 0;
  for (int i = 0; i < n && c == 0; i++) {
    c =
        (fields[i].getReverse())
            ? comparators[i].compare(docB, docA)
            : comparators[i].compare(docA, docB);
  }
  // solve tiebreaks by comparing shards (similar to using docid)
  // smaller docid's beat larger ids, so reverse the natural ordering
  if (c == 0) {
    c = -docA.shard.compareTo(docB.shard);
  }
  return c < 0;
}
The final tie-break compares ShardDoc.shard:
  // solve tiebreaks by comparing shards (similar to using docid)
  // smaller docid's beat larger ids, so reverse the natural ordering
  if (c == 0) {
    c = -docA.shard.compareTo(docB.shard);
  }
Here ShardDoc.shard contains the node IP as well as the shard name, for example:
http://127.0.0.1:8983/solr/my_collection_shard1_replica_n1
Consider this setup: one collection with 2 shards and 2 replicas each,
running on a 2-node cluster. For the same query, documents may come from
either of the following core combinations:
1. http://node1_ip:8983/solr/my_collection_shard1_replica_n1 +
http://node2_ip:8983/solr/my_collection_shard2_replica_n2
2. http://node2_ip:8983/solr/my_collection_shard1_replica_n2 +
http://node1_ip:8983/solr/my_collection_shard2_replica_n1
Hence the same request may return different document rankings, depending on
which replicas serve it, whenever documents from both shards have the same
score. This can get worse with more nodes/shards/replicas.
I'm wondering if we should use just the shard name (without the node IP) for
tie-breaking instead, if that's possible.
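
To make the flip concrete, here is a small standalone sketch (plain Java, not
Solr code) that applies the same tie-break expression to the two core
combinations above. The shardName helper and its regex are hypothetical: they
assume core names of the form <collection>_<shardN>_replica_<...>, which is
only what the example URLs show, and they are not an existing Solr API.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TieBreakDemo {
  // Hypothetical pattern, assuming core names like my_collection_shard1_replica_n1.
  private static final Pattern SHARD_NAME = Pattern.compile("_(shard\\d+)_replica_");

  // Same expression as the existing tie-break, applied to the raw shard strings.
  static boolean lessThanByUrl(String shardA, String shardB) {
    int c = -shardA.compareTo(shardB);
    return c < 0;
  }

  // Hypothetical helper: pull the shard name out of a core URL.
  // Falls back to the full string if the assumed pattern does not match.
  static String shardName(String coreUrl) {
    Matcher m = SHARD_NAME.matcher(coreUrl);
    return m.find() ? m.group(1) : coreUrl;
  }

  // Proposed variant: tie-break on the shard name only, ignoring the node address.
  static boolean lessThanByShardName(String shardA, String shardB) {
    int c = -shardName(shardA).compareTo(shardName(shardB));
    return c < 0;
  }

  public static void main(String[] args) {
    // Combination 1: shard1 served by node1, shard2 served by node2.
    String s1c1 = "http://node1_ip:8983/solr/my_collection_shard1_replica_n1";
    String s2c1 = "http://node2_ip:8983/solr/my_collection_shard2_replica_n2";
    // Combination 2: shard1 served by node2, shard2 served by node1.
    String s1c2 = "http://node2_ip:8983/solr/my_collection_shard1_replica_n2";
    String s2c2 = "http://node1_ip:8983/solr/my_collection_shard2_replica_n1";

    // Tie-break on the full URL: the result depends on which node serves shard1.
    System.out.println(lessThanByUrl(s1c1, s2c1)); // false
    System.out.println(lessThanByUrl(s1c2, s2c2)); // true

    // Tie-break on the shard name: the same result for both combinations.
    System.out.println(lessThanByShardName(s1c1, s2c1)); // false
    System.out.println(lessThanByShardName(s1c2, s2c2)); // false
  }
}
```

With the full URL, the shard1 document wins the tie in combination 1 and loses
it in combination 2. With the shard name only, the outcome is the same in both.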
Thank you,
Yue