Hello
I’ve received user feedback regarding collocated join
<https://solr.apache.org/guide/solr/latest/query-guide/join-query-parser.html#joining-multiple-shard-collections>:

> The documents in the system are split across 2 collections: text plus
> frequently used metadata in one, and extended metadata in the other (1 to 1).
> There are expected to be 40 million documents, which are usually divided
> into 8 shards, with the average size of a document with extended metadata
> being 2KB.
> That is, without sharding, such an index would have been very roughly
> 80GB, but in fact each shard is only about 10GB 😎
> This greatly helps mmap for searching the main collection (it's hard to
> assess the impact, but it's unlikely to have gotten worse 😅)
> All JOIN queries are, as expected, faster (they used to take up to a
> minute, now up to a dozen seconds), almost proportionally to the degree of
> sharding.
> As a bonus, we also get fast indexing of documents (we didn't measure it
> on real data here, as it wasn't a problematic area, but the traffic has
> become much lower according to network monitoring)
> It's been in commercial operation for over a year, by the way 😉
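
The sizes quoted above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: it assumes decimal units (1GB = 1,000,000KB) and an even spread of documents across shards, neither of which is stated in the feedback, and the function names are made up for illustration.

```python
def index_size_gb(num_docs: int, avg_doc_kb: float) -> float:
    """Rough total index size in GB, assuming 1 GB = 1,000,000 KB."""
    return num_docs * avg_doc_kb / 1_000_000


def per_shard_gb(total_gb: float, num_shards: int) -> float:
    """Rough size of one shard, assuming documents are spread evenly."""
    return total_gb / num_shards


# Figures from the feedback: 40 million documents, ~2KB each, 8 shards.
total = index_size_gb(40_000_000, 2)
shard = per_shard_gb(total, 8)
print(f"total ~{total:.0f}GB, per shard ~{shard:.0f}GB")
```

With the numbers from the feedback this gives roughly 80GB in total and roughly 10GB per shard.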



Is anyone else using it in production?

On Thu, Jun 1, 2023 at 12:19 PM Mikhail Khludnev <[email protected]> wrote:

> Hello,
> I think I'm done with code and tests. This feature allows joining
> collections with multiple shards on both sides for the sake of scalability.
> It requires me to introduce AffinityPlacementPlugin.withCollectionShards
> where I feel much uncertainty and need good advice.
> https://issues.apache.org/jira/browse/SOLR-16717
> https://github.com/apache/solr/pull/1550/
>
> Thanks!
> --
> Sincerely yours
> Mikhail Khludnev
>


-- 
Sincerely yours
Mikhail Khludnev
