After some research, it appears the following approach may help in this
situation and remove the requirement of co-locating indexes for joins.  One
drawback may be the types of fields supported for the join field.

https://solr.apache.org/guide/8_8/other-parsers.html#cross-collection-join
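
As a rough sketch of what such a request might look like, the Python below
builds the parameters for a filter query using the join parser with
method="crossCollection", as described on the page above.  The collection
names (collectionA, collectionB), the field name join_key_s, and the inner
query year_i:2021 are made up for illustration, and I have not run this
against a live cluster.

```python
from urllib.parse import urlencode


def build_join_params(from_index, from_field, to_field, join_query, main_query="*:*"):
    """Build request parameters for a cross-collection join filter.

    The inner query is passed as a separate request parameter (joinq) and
    referenced with v=$joinq, so it does not need to be quoted inside the
    local params.
    """
    local_params = (
        '{!join method="crossCollection"'
        + f' fromIndex="{from_index}"'
        + f' from="{from_field}"'
        + f' to="{to_field}"'
        + ' v=$joinq}'
    )
    return {"q": main_query, "fq": local_params, "joinq": join_query}


if __name__ == "__main__":
    # Hypothetical names: collectionA is queried, collectionB is joined in.
    params = build_join_params("collectionB", "join_key_s", "join_key_s", "year_i:2021")
    print("/solr/collectionA/select?" + urlencode(params))
```

The request would be sent to the "to" collection (collectionA here), with
fromIndex naming the other collection, so the two collections would not need
replicas on the same nodes.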

Matt

On Wed, Jun 30, 2021 at 11:59 AM Matt Kuiper <[email protected]> wrote:

> Hi Solr Group,
>
> I am not sure whether the following is a viable use case, and I welcome
> input and any implementation recommendations.
>
> I would like to perform joins over two sharded collections, where docs
> are routed to specific shards based on a date range, and the date ranges
> are the same for the corresponding shards in each collection.
>
> I understand that this means that the replicas from each collection that
> hold data to be joined need to be co-located on the same Solr server.  I
> have read solutions that use ADD REPLICA to add a Collection B replica to
> all Solr servers, assuming Collection B has only one shard.  For my use
> case I need Collection B to have multiple shards.
>
> *Collection A            Collection B          SolrServer*
> Shard1_2020              Shard1_2020           172.33.0.1:8983_solr
> Shard2_2021              Shard2_2021           172.33.0.2:8983_solr
> Shard3_2022              Shard3_2022           172.33.0.3:8983_solr
>
> I think my question comes down to how I break shards by a date range, and
> do it in a way that both Collections A and B would be defined by the same
> date range.  If I could reliably break shards by date, and know the date
> range of each shard, I think I could use the ADD REPLICA API to align them.
>
> I am not sure a compositeId routing approach would work, but I think
> implicit routing may be hard to manage over time.
>
> Is an approach like this viable?  I am a bit concerned about maintenance.
> Are there other ideas to support this join?
>
> Note: I am considering this within Time series collections...
>
> Matt
>
