[
https://issues.apache.org/jira/browse/SOLR-6266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14144590#comment-14144590
]
Andrzej Bialecki commented on SOLR-6266:
-----------------------------------------
Hi Joel, Karol and I work together, so I thought I'd chime in.
bq. We'll need to also figure out where to place the CAPIServer so there is
only one per node.
I don't think Solr has a place for such "global" components yet; the only
special component that is global is the CoreAdminHandler. It would be a nice
feature, but it's outside the scope of this issue.
So, if you can't have what you like you have to like what you have ;) This
means that for now the only option is to run an instance of CAPIServer per
collection.
bq. From my understanding the CAPIServer is listening on an ip/port. Couchbase
can be configured to replicate a bucket to a specific host and port.
Karol is now working on using the Couchbase REST API to configure Couchbase
automatically to send docs to whichever instance of CAPIServer is active.
This will eliminate the need for manual configuration on the Couchbase end,
and will allow the replication to be re-targeted to any other instance that
becomes active, should the current instance of CAPIServer disappear.
Regarding running CAPIServers on all replicas: with the auto-configuration
mechanism described above it's not needed. It's enough to activate a single
instance per collection, e.g. always on the first shard's leader. If this
node goes down, another leader will be elected and the CAPIServer instance will
activate there and register itself with Couchbase.
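To make the activation rule concrete, here is a minimal sketch of the decision
each node would take. It is not the plugin's code: the class name, the method
name and the plain shard-to-leader map are assumptions made for illustration,
and "first shard" is taken to mean the first shard name in sorted order.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Solr or of the attached plugin.
final class CapiActivation {

    /**
     * Returns true if nodeName is the current leader of the first shard,
     * where "first" is assumed to mean the smallest shard name in sorted order.
     *
     * @param nodeName     name of the node asking whether it should activate
     * @param shardLeaders shard name -> name of the node currently leading it
     */
    static boolean shouldActivate(String nodeName, Map<String, String> shardLeaders) {
        if (shardLeaders.isEmpty()) {
            return false; // no shards known yet, so nobody activates
        }
        String firstShard = new TreeMap<>(shardLeaders).firstKey();
        return nodeName.equals(shardLeaders.get(firstShard));
    }
}
```

Each node would re-evaluate this after a leader election; the node for which
it turns true starts its CAPIServer and registers itself with Couchbase.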
Couchbase always sends all changes for a bucket to a replication target, so if
you had in mind an optimization where each shard would get only its own
documents then it wouldn't work - CAPIServer-s would get all documents anyway
and they would have to discard (N-1)/N of the docs - so this would only create
heavier load on Couchbase and Solr.
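A rough back-of-the-envelope version of that load, as a sketch. The names are
made up for illustration, and it assumes documents are spread evenly across
the N shards.

```java
// Hypothetical illustration of the per-shard receiver cost; not plugin code.
final class ShardFilterLoad {

    /** Documents Couchbase has to send when every shard runs its own receiver. */
    static long totalSent(int shards, long docs) {
        return (long) shards * docs;
    }

    /** Documents each receiver throws away, assuming an even spread: (N-1)/N of docs. */
    static long discardedPerShard(int shards, long docs) {
        return docs * (shards - 1) / shards;
    }
}
```

With 4 shards and 1000 changed documents, Couchbase would send 4000 documents
instead of 1000, and each receiver would discard 750 of the 1000 it gets.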
If we ran multiple active CAPIServer-s on replicas it wouldn't work right
either - copies of the same documents would be received multiple times, and
while they would be correctly re-routed to the right shards, each shard would
receive multiple copies and the ordering would be non-deterministic - not so
important for adds but crucial for a mix of adds / deletes.
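One way to see the add / delete problem is to replay a sequence of updates
against a toy index and record whether a document is visible after each step.
This is only a sketch: the classes below are hypothetical and model the index
as a plain set of ids, with no versioning.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy model of duplicate delivery; not Solr or plugin code.
final class DuplicateDelivery {

    /** A single update: an add (delete == false) or a delete of one document id. */
    static final class Op {
        final boolean delete;
        final String id;

        Op(boolean delete, String id) {
            this.delete = delete;
            this.id = id;
        }
    }

    /** Applies ops in arrival order; returns whether id is in the index after each op. */
    static List<Boolean> visibility(List<Op> ops, String id) {
        Set<String> index = new HashSet<>();
        List<Boolean> visible = new ArrayList<>();
        for (Op op : ops) {
            if (op.delete) {
                index.remove(op.id);
            } else {
                index.add(op.id);
            }
            visible.add(index.contains(id));
        }
        return visible;
    }
}
```

With a single receiver, "add doc1, delete doc1" leaves doc1 visible and then
gone. If a second receiver's copy of the same two updates arrives afterwards,
the arrival order becomes add, delete, add, delete, and doc1 reappears in the
index after it was already deleted, until the second delete lands.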
> Couchbase plug-in for Solr
> --------------------------
>
> Key: SOLR-6266
> URL: https://issues.apache.org/jira/browse/SOLR-6266
> Project: Solr
> Issue Type: New Feature
> Reporter: Varun
> Assignee: Joel Bernstein
> Attachments: solr-couchbase-plugin.tar.gz,
> solr-couchbase-plugin.tar.gz
>
>
> It would be great if users could connect Couchbase and Solr so that updates
> to Couchbase can automatically flow to Solr. Couchbase provides some very
> nice APIs which allow applications to mimic the behavior of a Couchbase
> server so that they can receive updates via Couchbase's normal cross data
> center replication (XDCR).
> One possible design for this is to create a CouchbaseLoader that extends
> ContentStreamLoader. This new loader would embed the Couchbase APIs that
> listen for incoming updates from Couchbase, then marshal the Couchbase
> updates into the normal Solr update process.
> Instead of marshaling Couchbase updates into the normal Solr update process,
> we could also embed a SolrJ client to relay the request through the HTTP
> interfaces. This may be necessary if we have to handle mapping Couchbase
> "buckets" to Solr collections on the Solr side.